r - Extracting data from a list of lists into its own `data.frame` with `purrr`

June 15, 2014

Representative sample data (list of lists):

l <- list(
  structure(list(a = -1.54676469632688, b = "s", c = "t",
    d = structure(list(id = 5L, label = "utah", link = "asia/anadyr",
      score = -0.21104594634643), .Names = c("id", "label", "link", "score")),
    e = 49.1279871269422), .Names = c("a", "b", "c", "d", "e")),
  structure(list(a = -0.934821052832427, b = "k", c = "t",
    d = list(
      structure(list(id = 8L, label = "south carolina", link = "pacific/wallis",
        score = 0.526540892113734, externalid = -6.74354377676955),
        .Names = c("id", "label", "link", "score", "externalid")),
      structure(list(id = 9L, label = "nebraska", link = "america/scoresbysund",
        score = 0.250895465294041, externalid = 16.4257470807879),
        .Names = c("id", "label", "link", "score", "externalid"))),
    e = 52.3161400117052), .Names = c("a", "b", "c", "d", "e")),
  structure(list(a = -0.27261485993069, b = "f", c = "p",
    d = list(
      structure(list(id = 8L, label = "georgia", link = "america/nome",
        score = 0.526494135483816, externalid = 7.91583574935589),
        .Names = c("id", "label", "link", "score", "externalid")),
      structure(list(id = 2L, label = "washington", link = "america/shiprock",
        score = -0.555186440792989, externalid = 15.0686663219837),
        .Names = c("id", "label", "link", "score", "externalid")),
      structure(list(id = 6L, label = "north dakota", link = "universal",
        score = 1.03168296038975),
        .Names = c("id", "label", "link", "score")),
      structure(list(id = 1L, label = "new hampshire", link = "america/cordoba",
        score = 1.21582056168681, externalid = 9.7276418869132),
        .Names = c("id", "label", "link", "score", "externalid")),
      structure(list(id = 1L, label = "alaska", link = "asia/istanbul",
        score = -0.23183264861979),
        .Names = c("id", "label", "link", "score")),
      structure(list(id = 4L, label = "pennsylvania", link = "africa/dar_es_salaam",
        score = 0.590245339334121),
        .Names = c("id", "label", "link", "score"))),
    e = 132.1153538536), .Names = c("a", "b", "c", "d", "e")),
  structure(list(a = 0.202685974077313, b = "x", c = "o",
    d = structure(list(id = 3L, label = "delaware", link = "asia/samarkand",
      score = 0.695577130634724, externalid = 15.2364820698193),
      .Names = c("id", "label", "link", "score", "externalid")),
    e = 97.9908914452971), .Names = c("a", "b", "c", "d", "e")),
  structure(list(a = -0.396243444741009, b = "z", c = "p",
    d = list(
      structure(list(id = 4L, label = "north dakota", link = "america/tortola",
        score = 1.03060272795705, externalid = -7.21666936522344),
        .Names = c("id", "label", "link", "score", "externalid")),
      structure(list(id = 9L, label = "nebraska", link = "america/ojinaga",
        score = -1.11397997280413, externalid = -8.45145052697411),
        .Names = c("id", "label", "link", "score", "externalid"))),
    e = 123.597945533926), .Names = c("a", "b", "c", "d", "e")))

I have a list of lists, by virtue of a JSON data download.

The list has 176 elements, each with 33 nested elements, some of which are themselves lists of varying length.

I am interested in analyzing the data contained in one particular nested list, which has a length of ~150 for each of the 176 elements. Each of its entries has either 4 or 5 elements: some have 4 and some have 5. I am trying to extract this nested list of interest and convert it into a data.frame so that I am able to perform some analysis.

In the representative sample data above, I am interested in the nested list d for each of the 5 elements of l. The desired data.frame would therefore look something like:

id           label            link       score  externalid
 5            utah     asia/anadyr  -0.2110459          NA
 8  south carolina  pacific/wallis   0.5265409   -6.743544
 .
 .

I've been attempting to use purrr, which appears to have a sensible and consistent flow for processing data in lists, but I am running into errors whose cause I can't fully understand, probably because I don't properly understand the commands/logic of purrr or of lists (likely both). This is the code I've been attempting, followed by the error it throws:

df <- map_df(l, "d", ~as.data.frame(.))
Error: incompatible sizes (5 != 4)

I believe this has something to do with the differing lengths of d for each component, or perhaps the differing contained data (sometimes 4 elements, sometimes 5), or perhaps the function I've used here is misspecified. Truthfully, I'm not entirely sure.
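One way to check the first two guesses is to look at the shape of d directly before trying to bind anything. The sketch below uses a hypothetical two-element miniature of the sample data (the name x and its values are made up for illustration, not taken from the real download) and reports, per element, whether d is a single record or a list of records, and how many fields each record has.

```r
library(purrr)

# Hypothetical miniature of the question's data: `d` is a single record in
# the first element and a list of records in the second.
x <- list(
  list(a = 1, d = list(id = 5L, label = "utah", score = -0.21)),
  list(a = 2, d = list(
    list(id = 8L, label = "south carolina", score = 0.53, externalid = -6.74),
    list(id = 9L, label = "nebraska", score = 0.25, externalid = 16.43)
  ))
)

# Pull `d` out of every top-level element.
d_list <- map(x, "d")

# TRUE where `d` is itself one named record, FALSE where it is an unnamed
# list of records.
is_single_record <- map_lgl(d_list, ~ !is.null(names(.x)))

# Number of fields in every record, treating a single record as a list of one.
fields_per_record <- map(d_list, function(d) {
  records <- if (!is.null(names(d))) list(d) else d
  map_int(records, length)
})

print(is_single_record)
print(fields_per_record)
```

In the miniature, both kinds of irregularity show up: d changes shape between elements, and the records do not all carry the same fields.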

I have worked around this using a for loop, which I know is inefficient, hence my question here on SO.

This is the for loop I currently employ:

df <- data.frame(id = integer(), label = character(), score = numeric(), externalid = numeric())
for (i in seq_along(l)) {
    df_temp <- l[[i]][[4]] %>% map_df(~as.data.frame(.))
    df <- rbind(df, df_temp)
}

Some assistance, preferably with purrr (or alternatively with some version of apply, as that would still be superior to my for loop), would be much appreciated. Also, if there's a resource that covers the above, I'd like to understand it rather than just find the right code.

You can do this in three steps: first pull out d, then bind the rows within each element of d, and finally bind everything into a single object.

I use bind_rows from dplyr for the within-list row binding, and map_df for the final row binding.

library(purrr)
library(dplyr)

l %>%
    map("d") %>%
    map_df(bind_rows)

This is also equivalent:

map_df(l, ~bind_rows(.x[["d"]]))

The result looks like:

# A tibble: 12 x 5
      id          label                 link      score externalid
   <int>          <chr>                <chr>      <dbl>      <dbl>
 1     5           utah          asia/anadyr -0.2110459         NA
 2     8 south carolina       pacific/wallis  0.5265409  -6.743544
 3     9       nebraska america/scoresbysund  0.2508955  16.425747
 4     8        georgia         america/nome  0.5264941   7.915836
 5     2     washington     america/shiprock -0.5551864  15.068666
 6     6   north dakota            universal  1.0316830         NA
 7     1  new hampshire      america/cordoba  1.2158206   9.727642
 8     1         alaska        asia/istanbul -0.2318326         NA
 9     4   pennsylvania africa/dar_es_salaam  0.5902453         NA
10     3       delaware       asia/samarkand  0.6955771  15.236482
11     4   north dakota      america/tortola  1.0306027  -7.216669
12     9       nebraska      america/ojinaga -1.1139800  -8.451451
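To see what each of the three steps contributes, it may help to run them one at a time and keep the intermediate results. This is only a sketch on a hypothetical two-element miniature of the sample data (x, step1, step2, result and one_liner are illustrative names, not from the original post), and it uses a plain bind_rows call for the final binding in place of map_df.

```r
library(purrr)
library(dplyr)

# Hypothetical miniature of the question's data (values are made up).
x <- list(
  list(a = 1, d = list(id = 5L, label = "utah", score = -0.21)),
  list(a = 2, d = list(
    list(id = 8L, label = "south carolina", score = 0.53, externalid = -6.74),
    list(id = 9L, label = "nebraska", score = 0.25, externalid = 16.43)
  ))
)

# Step 1: pull `d` out of each top-level element.
step1 <- map(x, "d")

# Step 2: bind the rows within each `d`, giving one data frame per element.
# A single record becomes a one-row data frame.
step2 <- map(step1, bind_rows)

# Step 3: bind those data frames into a single object. Columns missing from
# some records (here `externalid`) are filled with NA.
result <- bind_rows(step2)

# The one-line form from the answer, for comparison.
one_liner <- map_df(x, ~ bind_rows(.x[["d"]]))

print(result)
```

Splitting the pipeline this way makes it easy to inspect step2 and confirm that every element of l has been turned into a data frame before the final binding.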
