R - How to efficiently swap elements between columns in a dataframe?

September 15, 2010

I asked a similar question before, but realized that the previous example was a little special in the sense that the factor levels were equally spaced. Here I want to reframe the question in a more generic way, because the solutions in the old thread do not work properly for it.

Suppose I have the following dataframe in R:

set.seed(1)
(tmp <- data.frame(x = 1:10,
                   r1 = sample(c('a','d','f','g','i'), 10, replace = TRUE),
                   r2 = sample(c('d','f','g','i','z'), 10, replace = TRUE),
                   stringsAsFactors = FALSE))

    x r1 r2
1   1  d  f
2   2  d  d
3   3  f  i
4   4  i  f
5   5  d  i
6   6  i  g
7   7  i  i
8   8  g  z
9   9  g  f
10 10  a  i

Notice that the two columns r1 and r2 do not share the same elements. I want to do the following: if the difference between the element index (the sequential order among the levels) of column r1 and that of column r2 is an odd number, the elements of the two columns need to be switched in that row. This can be performed with the following code:

for (ii in 1:dim(tmp)[1]) {
   kk <- which(levels(as.factor(tmp$r2)) %in% tmp[ii, 'r2'], arr.ind = TRUE) -
         which(levels(as.factor(tmp$r1)) %in% tmp[ii, 'r1'], arr.ind = TRUE)
   if (kk %% 2 != 0) {  # swap elements between the 2 columns
      qq <- tmp[ii, ]$r1
      tmp[ii, ]$r1 <- tmp[ii, ]$r2
      tmp[ii, ]$r2 <- qq
   }
}

Since the two columns r1 and r2 don't share the same elements, I purposefully created the dataframe tmp so that r1 and r2 are not factors, which lets me swap elements between the two columns with the kludgy code above. Below is the output after swapping:

    x r1 r2
1   1  d  f
2   2  d  d
3   3  i  f
4   4  i  f
5   5  d  i
6   6  g  i
7   7  i  i
8   8  z  g
9   9  f  g
10 10  i  a

My solution is awkward and slow for a big dataframe. Is there a more elegant way to perform this operation?

# convert to character
dat[, c("r1", "r2")] <- lapply(dat[, c("r1", "r2")], as.character)
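The conversion matters because a factor only accepts values that are already among its levels. The following is a minimal sketch of that behaviour; the vectors f and s are hypothetical and not part of the data above.

```r
# Hypothetical two-level factor, not taken from the data above
f <- factor(c("a", "d"))

# "z" is not among the levels, so the assignment yields NA (with a warning)
suppressWarnings(f[1] <- "z")
is.na(f[1])

# After conversion to character, any string can be assigned
s <- as.character(factor(c("a", "d")))
s[1] <- "z"
s
```

With character columns, values can move freely between r1 and r2 even though the two columns do not share the same set of elements.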

Next, vectorize the row-change condition. The TRUE elements mark the rows that need to be swapped.

# logical indicator of the elements to change
changeind <- !!((match(dat$r2, levels(as.factor(dat$r2))) -
                 match(dat$r1, levels(as.factor(dat$r1)))) %% 2)

# perform swapping for the given rows
dat[changeind, c("r1", "r2")] <- dat[changeind, c("r2", "r1")]

Here, I use match to select the rows where changes are needed. After this, I perform a simple swapping of the variables using [.
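To see the two steps in isolation, here is a small self-contained sketch on a hypothetical four-row data frame. The names toy, idx1, idx2 and swap are made up for illustration, and the values are not from the question.

```r
# Hypothetical data, chosen so that the level indices are easy to check by hand
toy <- data.frame(x  = 1:4,
                  r1 = c("a", "d", "f", "a"),
                  r2 = c("d", "d", "z", "z"),
                  stringsAsFactors = FALSE)

# Position of each element among the sorted levels of its own column
idx1 <- match(toy$r1, levels(as.factor(toy$r1)))  # levels: a, d, f
idx2 <- match(toy$r2, levels(as.factor(toy$r2)))  # levels: d, z

# TRUE where the index difference is odd (in R, -1 %% 2 is 1)
swap <- (idx2 - idx1) %% 2 != 0

# Swap the two columns in the flagged rows only
toy[swap, c("r1", "r2")] <- toy[swap, c("r2", "r1")]
toy
```

The levels are computed once, before any row is changed, so every row is judged against the same level ordering.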

This returns

dat

    x r1 r2
1   1  d  f
2   2  d  d
3   3  f  i
4   4  f  i
5   5  d  i
6   6  g  i
7   7  i  i
8   8  g  z
9   9  f  g
10 10  a  i

Note that there may be a typo in the desired output, since

identical((sapply(seq_len(nrow(dat)),
           function(x) which(levels(as.factor(dat$r2)) %in% dat[x, 'r2'], arr.ind = TRUE) -
                       which(levels(as.factor(dat$r1)) %in% dat[x, 'r1'], arr.ind = TRUE)) %% 2) != 0,
          changeind)
[1] TRUE
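One possible source of such a mismatch, offered here as an assumption rather than something either post states, is that a loop which swaps in place and calls levels(as.factor(...)) on every iteration sees level sets that shift as rows are swapped. The sketch below uses a hypothetical three-row data frame (small, fixed, looped and swapped are illustrative names) to compare both approaches.

```r
# Hypothetical data, not from the question
small <- data.frame(r1 = c("a", "c", "a"),
                    r2 = c("d", "b", "b"),
                    stringsAsFactors = FALSE)

# Vectorized: levels are computed once, from the untouched columns
fixed <- (match(small$r2, levels(as.factor(small$r2))) -
          match(small$r1, levels(as.factor(small$r1)))) %% 2 != 0

# Loop: levels are recomputed from the partially swapped columns
looped  <- small
swapped <- logical(nrow(looped))
for (ii in seq_len(nrow(looped))) {
  kk <- match(looped$r2[ii], levels(as.factor(looped$r2))) -
        match(looped$r1[ii], levels(as.factor(looped$r1)))
  if (kk %% 2 != 0) {
    qq <- looped$r1[ii]
    looped$r1[ii] <- looped$r2[ii]
    looped$r2[ii] <- qq
    swapped[ii] <- TRUE
  }
}

fixed    # rows flagged with fixed levels
swapped  # rows actually swapped by the in-place loop
```

In this toy case the vectorized version flags rows 1 and 2, while the in-place loop swaps rows 1 and 3, because swapping row 1 moves "d" into r1 and "a" into r2 and thereby changes the index of every later element.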

Data

dat <- structure(list(x = 1:10, r1 = structure(c(1L, 1L, 4L, 4L, 1L,
3L, 4L, 5L, 2L, 4L), .Label = c("d", "f", "g", "i", "z"), class = "factor"),
    r2 = structure(c(3L, 2L, 3L, 3L, 5L, 5L, 5L, 4L, 4L, 1L), .Label = c("a",
    "d", "f", "g", "i"), class = "factor")), .Names = c("x",
"r1", "r2"), class = "data.frame", row.names = c("1", "2", "3",
"4", "5", "6", "7", "8", "9", "10"))
