lapply - Extracting data from a nested list in R
I have downloaded a list of addresses from the google_reverse_geocode API for a list of places with latitude and longitude information. Since I'm new to R, I don't know how to extract the useful information. The code for downloading the databases is at the bottom of the question.
The structure of the list is, in general, this:
    $ 60 :list of 1
     ..$ results:'data.frame': 1 obs. of 5 variables:
     .. ..$ address_components:list of 1
     .. .. ..$ :'data.frame': 8 obs. of 3 variables:
     .. .. .. ..$ long_name : chr [1:8] "119" "avenida diego díaz de berlanga" "jardines de anahuac 2do sector" "san nicolás de los garza" ...
     .. .. .. ..$ short_name: chr [1:8] "119" "avenida diego díaz de berlanga" "jardines de anahuac 2do sector" "san nicolás de los garza" ...
     .. .. .. ..$ types :list of 8
     .. .. .. .. ..$ : chr "street_number"
     .. .. .. .. ..$ : chr "route"
     .. .. .. .. ..$ : chr [1:3] "political" "sublocality" "sublocality_level_1"
     .. .. .. .. ..$ : chr [1:2] "locality" "political"
     .. .. .. .. ..$ : chr [1:2] "administrative_area_level_2" "political"
     .. .. .. .. ..$ : chr [1:2] "administrative_area_level_1" "political"
     .. .. .. .. ..$ : chr [1:2] "country" "political"
     .. .. .. .. ..$ : chr "postal_code"
     .. ..$ formatted_address : chr "avenida diego díaz de berlanga 119, jardines de anahuac 2do sector, 66444 san nicolás de los garza, n.l., mexico"
     .. ..$ geometry :'data.frame': 1 obs. of 3 variables:
     .. .. ..$ location :'data.frame': 1 obs. of 2 variables:
     .. .. .. ..$ lat: num 25.7
     .. .. .. ..$ lng: num -100
     .. .. ..$ location_type: chr "rooftop"
     .. .. ..$ viewport :'data.frame': 1 obs. of 2 variables:
     .. .. .. ..$ northeast:'data.frame': 1 obs. of 2 variables:
     .. .. .. .. ..$ lat: num 25.7
     .. .. .. .. ..$ lng: num -100
     .. .. .. ..$ southwest:'data.frame': 1 obs. of 2 variables:
     .. .. .. .. ..$ lat: num 25.7
     .. .. .. .. ..$ lng: num -100
     .. ..$ place_id : chr "chijry_wpdquyoyrtjett6ajeta"
     .. ..$ types :list of 1
     .. .. ..$ : chr "street_address"
I need this information in a data frame to perform the analysis. The information I need is c(latitude, longitude, formatted_address, place_id).
The code I have written is this:
prueba <- sapply(direccion1, function(x){ uno <- unlist(x[[1]]) })
pureba2 <- data.frame(prueba)
I get the following error: Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 40, 32, 37, 44, 36, 0, 41, 28, 39, 47, 43, 35, 48
I have tried other code as well, but none of it works.
The code for downloading the data that contains the longitude and latitude is the following:
    # CRE files
    library(easypackages)
    my_packages <- c("ggmap", "maps", "mapdata", "rlist", "readr", "tidyverse",
                     "lubridate", "stringr", "rebus", "stringi", "purrr",
                     "geosphere", "XML", "RCurl", "xml2")
    libraries(my_packages)

    # set link to the website
    link1 <- ("https://publicacionexterna.azurewebsites.net/publicaciones/prices")
    # get the data from the webpage
    data_prices <- getURL(link1)
    # parse the XML data
    xmlfile <- xmlParse(data_prices)
    # get the place nodes
    places <- getNodeSet(xmlfile, "//place")
    # get the values for each place
    values <- lapply(places, function(x){
      # current place id
      p_id <- xmlAttrs(x)
      # values for each gas type of the current place
      newrows <- lapply(xmlChildren(x), function(y){
        # type and update time
        attrs <- xmlAttrs(y)
        # price value
        price <- xmlValue(y)
        names(price) <- "price"
        # return the values
        return(c(p_id, attrs, price))
      })
      # combine the rows into a single list
      newrows <- do.call(rbind, newrows)
      # return the rows
      return(newrows)
    })
    # combine all values into a single data frame
    datosdeprecios <- as.data.frame(do.call(rbind, values), stringsAsFactors = FALSE)
    # re-set the row names of the data frame
    row.names(datosdeprecios) <- c(1:nrow(datosdeprecios))

    # set link to the website for the places file
    link2 <- ("https://publicacionexterna.azurewebsites.net/publicaciones/places")
    data_places <- read_xml(link2)
    datos_id <- data_places %>% xml_find_all("//place") %>% xml_attr("place_id")
    datos_name <- data_places %>% xml_find_all("//name") %>% xml_text("name")
    datos_brand <- data_places %>% xml_find_all("//brand") %>% xml_text("brand")
    datos_cre_id <- data_places %>% xml_find_all("//cre_id") %>% xml_text("cre_id")
    datos_category <- data_places %>% xml_find_all("//category") %>% xml_text("category")
    datos_adress_street <- data_places %>% xml_find_all("//address_street") %>% xml_text("adress_street")
    datos_longitud <- data_places %>% xml_find_all("//x") %>% xml_text("x")
    datos_latitud <- data_places %>% xml_find_all("//y") %>% xml_text("y")
    datosdelugares <- data.frame(datos_id, datos_name, datos_brand, datos_cre_id,
                                 datos_category, datos_adress_street,
                                 datos_latitud, datos_longitud)
    colnames(datosdelugares) <- c("place_id", "name", "brand", "cre_id", "category",
                                  "adress_street", "latitude", "longitude")
    rm(data_prices, places, values, xmlfile, data_places, datos_adress_street,
       datos_brand, datos_category, datos_cre_id, datos_id, datos_name,
       datos_longitud, datos_latitud)
    rm(results, results2)
The code for getting the address information is the following:
    datosdeprecios <- datosdeprecios %>%
      data.frame(datosdeprecios) %>%
      mutate(place_id = as.numeric(place_id))
    datosdelugares <- datosdelugares %>%
      data.frame(datosdelugares) %>%
      mutate(place_id = as.numeric(place_id))
    basegeneral <- inner_join(datosdelugares, datosdeprecios, by = "place_id")
    basegeneral <- basegeneral %>%
      select(latitude, longitude, place_id) %>%
      mutate(latitude = as.numeric(as.character(latitude))) %>%
      mutate(longitude = as.numeric(as.character(longitude)))
    basegeneral <- basegeneral[1:100,]
    basegeneral <- apply(basegeneral, 1, function(x) {
      google_reverse_geocode(location = c(x["latitude"], x["longitude"]),
                             key = key,
                             result_type = "street_address")
    })
Thank you for your help. :)
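The error in the question can be reproduced without calling the API. The following is a minimal sketch, assuming a hand-built list `toy` (a hypothetical stand-in for the geocoding results, not real data) whose elements flatten to different lengths. It suggests why `sapply()` followed by `data.frame()` fails: `sapply()` cannot simplify vectors of unequal length, so it returns a list, and `data.frame()` then tries to use each element as a column.

```r
# Hypothetical stand-in for the geocoding results: three nested lists
# whose flattened lengths differ (3, 2 and 0 elements).
toy <- list(
  a = list(results = list(lat = 1, lng = 2, id = "x")),
  b = list(results = list(lat = 3, lng = 4)),
  c = list(results = list())
)

# sapply() cannot simplify vectors of unequal length, so it returns a list.
flat <- sapply(toy, function(x) unlist(x[[1]]))
lens <- lengths(flat)

# data.frame() treats each list element as a column and refuses
# columns of different lengths; capture the message instead of stopping.
err <- tryCatch(data.frame(flat), error = function(e) conditionMessage(e))
print(lens)
print(err)
```

Flattening everything with `unlist()` also discards the structure, so picking out the few fields of interest by name tends to be easier than trying to reshape the flattened vectors.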
You can extract information from lists using either the [[ notation or $.
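As a small illustration of the two notations, here is a sketch on a hand-built list. The object `res_toy` and its values are made up for this example and only mimic a small part of a geocoding result.

```r
# Hypothetical nested list that mimics a small part of a geocoding result.
res_toy <- list(
  results = data.frame(
    formatted_address = "1 Example St",
    place_id = "abc123",
    stringsAsFactors = FALSE
  )
)

# `$` takes a literal name ...
addr_dollar <- res_toy$results$formatted_address

# ... while `[[` takes a string, so the name can be stored in a variable.
field <- "place_id"
id_brackets <- res_toy[["results"]][[field]]
```

Inside functions that loop over many results, `[[` is often the safer choice because it matches names exactly, whereas `$` on a plain list allows partial matching.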
If we take the example given in ?google_reverse_geocode and store the result in res:
    library(googleway)
    res <- google_reverse_geocode(location = c(-37.81659, 144.9841),
                                  result_type = c("street_address"),
                                  location_type = "rooftop",
                                  key = key)
The lat/lon information is in res$results$geometry$location
The formatted address is in res$results$formatted_address
And the place_id is in res$results$place_id
So you can create a data.frame from these elements:
    data.frame(
      lat = res$results$geometry$location$lat,
      lon = res$results$geometry$location$lng,
      formatted_address = res$results$formatted_address,
      place_id = res$results$place_id
    )
If you had multiple lists of results, the process is similar; you just need to wrap it in an *apply function (or whatever looping mechanism you prefer):
    ## list of locations
    locations <- list(c(-37.81659, 144.9841), c(-37.81827, 144.9671))

    ## generating the reverse geocode for each location
    lst_res <- lapply(locations, function(x){
      google_reverse_geocode(location = x, key = key)
    })
Here, lst_res is a list of results from the geocoding function, which you can iterate over to extract the relevant parts:
    ## then you can extract the information
    lst_df <- lapply(lst_res, function(x){
      data.frame(
        lat = x[['results']][['geometry']][['location']][['lat']],
        lon = x[['results']][['geometry']][['location']][['lng']],
        formatted_address = x[['results']][['formatted_address']],
        place_id = x[['results']][['place_id']]
      )
    })
Here, lst_df is a list of data.frames. If you want to join them into one single data.frame you can use:
    df <- do.call(rbind, lst_df)

    ## et voila!
    head(df)
    #         lat      lon
    # 1 -37.81647 144.9841
    # 2 -37.81659 144.9841
    # 3 -37.81300 144.9850
    # 4 -37.81363 144.9631
    # 5 -37.81614 144.9805
    # 6 -37.81005 144.9281
    #                                                             formatted_address
    # 1 jolimont station, 175 wellington parade, east melbourne vic 3002, australia
    # 2       jolimont station, wellington cres, east melbourne vic 3002, australia
    # 3                                          east melbourne vic 3002, australia
    # 4                                                    melbourne vic, australia
    # 5                                          east melbourne vic 3002, australia
    # 6                                                   melbourne, vic, australia
    #                      place_id
    # 1 chijsxaubopc1morqhrunmozv4m
    # 2 chijidtrbupc1mormpt0cxzwbb0
    # 3 chijz25svmfc1moraoimixvwbau
    # 4 chij90260rvg1morkm2mixvwbaq
    # 5 chijg74w4upd1morsdqurnhwbbw
    # 6 chijv_fygknd1morpxlurxzurfs
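The error message in the question lists a 0 among the row counts, which suggests that at least one location came back without any results. Below is a minimal sketch of one way to keep such locations as NA rows so the final data.frame stays aligned with the input. It runs on hand-built mock results rather than real API responses: `make_mock`, `empty_mock` and `extract_row` are hypothetical names introduced for this example, and the shape of an empty response (an empty `results` element) is an assumption.

```r
# Hypothetical helper: builds a list shaped like the part of a
# google_reverse_geocode() result that the extraction step uses.
make_mock <- function(lat, lng, address, id) {
  location <- data.frame(lat = lat, lng = lng)
  geometry <- data.frame(location_type = "ROOFTOP", stringsAsFactors = FALSE)
  geometry$location <- location            # nested data.frame column
  results <- data.frame(formatted_address = address, place_id = id,
                        stringsAsFactors = FALSE)
  results$geometry <- geometry
  list(results = results, status = "OK")
}

# Assumption: a location with no match comes back with empty results.
empty_mock <- list(results = list(), status = "ZERO_RESULTS")

lst_mock <- list(
  make_mock(-37.81659, 144.9841, "Address A", "id_a"),
  empty_mock,
  make_mock(-37.81827, 144.9671, "Address B", "id_b")
)

# Hypothetical helper: one row per result, or a single NA row when empty.
extract_row <- function(x) {
  r <- x[["results"]]
  if (NROW(r) == 0) {
    return(data.frame(lat = NA_real_, lon = NA_real_,
                      formatted_address = NA_character_,
                      place_id = NA_character_,
                      stringsAsFactors = FALSE))
  }
  data.frame(
    lat = r[["geometry"]][["location"]][["lat"]],
    lon = r[["geometry"]][["location"]][["lng"]],
    formatted_address = r[["formatted_address"]],
    place_id = r[["place_id"]],
    stringsAsFactors = FALSE
  )
}

df_mock <- do.call(rbind, lapply(lst_mock, extract_row))
print(df_mock)
```

With real responses it is worth inspecting one empty result with `str()` first, since the check in `extract_row` may need adjusting to whatever the API actually returns.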