r - Iterating over columns in a data frame in order to replace values from matching data in a list of data frames


I'm interested in building a function that makes use of apply/sapply or Map to iterate over the available columns in dta and replace the values in each column with the matched values from a data frame available in a nameless list of data frames, where the list item index corresponds to the column number of the dta data frame.

Example

Given the objects:

    set.seed(1)
    size <- 20

    # data set
    dta <-
        data.frame(
            unita = sample(letters[1:4], size = size, replace = TRUE),
            unitb = sample(letters[16:20], size = size, replace = TRUE),
            unitc = sample(month.abb[1:4], size = size, replace = TRUE),
            somevalue = sample(1:1e6, size = size, replace = TRUE)
        )

    # meta data
    lstmeta <- list(
        # unit a definitions
        data.frame(
            v1 = c("a", "b", "d"),
            v2 = c("letter a", "letter b", "letter d")
        ),
        # unit b definitions
        data.frame(
            v1 = c("t", "q"),
            v2 = c("small t", "small q")
        ),
        # unit c definitions (month.abb values are capitalised: "Jan", "Feb", ...)
        data.frame(
            v1 = c("Mar", "Jan"),
            v2 = c("march", "january")
        )
    )

Desired results

When applied to dta, the function should return a data.frame corresponding to the extract below:

    unita       unitb    unitc      somevalue
    letter b    small t  Apr        912876
    letter b    small q  march      293604
    c           s        Apr        459066
    letter d    p        march      332395
    letter a    small q  march      650871
    letter d    small q  Apr        258017
    letter d    p        january    478546
    c           small q  Feb        766311
    c           small t  march      84247
    letter a    small q  march      875322
    letter a    r        Feb        339073
    letter a    r        Apr        839441
    c           r        Feb        346684
    letter b    p        january    333775
    letter d    small t  january    476352
    (...)

Existing approach

    replacelbls <- function(dataset, lstdict) {
        sapply(seq_along(dataset), function(i) {
            # take the corresponding metadata data frame
            dtadict <- lstdict[[i]]

            # replace values in the selected column:
            # matches on v1 push the corresponding values from v2
            dataset[,i][match(dataset[,i], dtadict[,1])] <- dtadict[,2][match(dtadict[,1], dataset[,i])]
        })
    }

    # testing -----------------------------------------------------------------

    replacelbls(dataset = dta, lstdict = lstmeta)

Of course, the approach proposed above does not work, because it tries to use NA in assignments; but it summarises what I want to achieve:

    Error in x[...] <- m : NAs are not allowed in subscripted assignments
    In addition: Warning message:
    In `[<-.factor`(`*tmp*`, match(dataset[, i], dtadict[, 1]), value = c(NA,  :
      invalid factor level, NA generated

Additional remarks

Source data set

The key characteristics of the data are:

  • The list is nameless, so subsetting has to be done by item numbers, not names
  • Item numbers correspond to column numbers
  • There is no full match between the metadata data frames available in the list of data frames and the unit columns available in the data
  • The somevalue column should also be iterated on, as it may contain labels that should be replaced

Solution

  • I'm not interested in dplyr/data.table/sqldf-based solutions.
  • I'm not interested in nested for-loops.

The following approach works for the example data:

    replacelbls <- function(dataset, lstdict) {
      dataset[seq_along(lstdict)] <- Map(function(x, lst) {
        x <- as.character(x)
        idx <- match(x, as.character(lst$v1))
        replace(x, !is.na(idx), as.character(lst$v2)[na.omit(idx)])
      }, dataset[seq_along(lstdict)], lstdict)
      dataset
    }

    head(replacelbls(dta, lstmeta))
    #      unita   unitb unitc somevalue
    # 1 letter b small t   Apr    912876
    # 2 letter b small q march    293604
    # 3        c       s   Apr    459066
    # 4 letter d       p march    332395
    # 5 letter a small q march    650871
    # 6 letter d small q   Apr    258017

This assumes that you want to apply the changes to the first X columns of your data, where X is the length of your meta-list. You might want to include a step to convert back to factor, since this approach converts the adjusted columns to character class.
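The conversion back to factor could be sketched roughly as follows. This is only an illustration, not part of the approach as originally posted: the function name `replacelbls_keep_factors` and the small `demo`/`dict` objects are hypothetical, and the sketch relies only on base R (`Map`, `match`, `replace`, `is.factor`, `factor`).

```r
# Hypothetical variant of replacelbls() that restores factor columns.
replacelbls_keep_factors <- function(dataset, lstdict) {
  cols <- seq_along(lstdict)
  dataset[cols] <- Map(function(x, lst) {
    was_factor <- is.factor(x)            # remember the original class
    x <- as.character(x)
    idx <- match(x, as.character(lst$v1))
    x <- replace(x, !is.na(idx), as.character(lst$v2)[na.omit(idx)])
    # convert back only if the column was a factor to begin with
    if (was_factor) factor(x) else x
  }, dataset[cols], lstdict)
  dataset
}

# Small hypothetical demo objects (not the data from the question)
demo <- data.frame(u = factor(c("a", "c", "b")), n = 1:3)
dict <- list(data.frame(v1 = c("a", "b"), v2 = c("letter a", "letter b")))

res <- replacelbls_keep_factors(demo, dict)
str(res)
```

Columns that were character to begin with stay character, and columns beyond the length of the list (here `n`) are left untouched.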

Another remark on factors: you could potentially speed up performance by working on the levels of factor variables instead of the whole column. The general process would be similar, but it requires a few more steps to check classes etc.
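A minimal sketch of the levels-based idea, assuming the column is already a factor: the helper name `relabel_levels` and the example objects `f` and `dict` are hypothetical and only serve as illustration. The lookup is done once per distinct level rather than once per row.

```r
# Hypothetical helper: relabel a factor by editing its levels only.
relabel_levels <- function(f, dict) {
  stopifnot(is.factor(f))
  lv  <- levels(f)
  idx <- match(lv, as.character(dict$v1))   # NA where a level has no label
  hit <- !is.na(idx)
  lv[hit] <- as.character(dict$v2)[idx[hit]]
  levels(f) <- lv                           # unmatched levels stay as they are
  f
}

# Small hypothetical example
f <- factor(c("a", "c", "b", "a"))
dict <- data.frame(
  v1 = c("a", "b", "d"),
  v2 = c("letter a", "letter b", "letter d")
)

out <- relabel_levels(f, dict)
levels(out)
```

Inside a `Map` call over the columns, a helper like this would only be used when `is.factor(x)` is true, with the character-based replacement as the fallback for other classes.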

