R - 'data.table::set' only works after assigning without set first. Working with a data.table of data.tables
I am using a data.table to store data in string format. The strings hold information that I want to retrieve using a function. This function, in the real script, does multiple calculations and parsing, and at the end returns a data.table with many columns and many rows. The function receives the whole row of the original data.table as its argument (all of its variables are used): myfun(dt[rownumber, ])
While some columns of the original data.table are still used later in the script, one of the variables in the data.table is expendable after processing, so I want to replace that variable with the data.table returned by the function. This lets me keep the link between the remaining variables and the new data.table, which I can later pass to other functions.
However, since I am working with many rows, I want to speed things up by using the data.table::set function to update the cell, but R won't let me use:
data.table::set(dt, i=rownum, j=colnum, value = list(list(myfun(dt[rownum, ]))))
unless I first do:
dt$somevar[1L] <- list(myfun(dt[1L, ]))
This is the warning I get when using set:
Warning message: In data.table::set(dt, i = rownum, j = colnum, value = list(list(myfun(dt[rownum, : Coerced 'list' RHS to 'character' to match the column's type. Either change the target column to 'list' first (by creating a new 'list' vector length 3 (nrows of entire table) and assign that; i.e. 'replace' column), or coerce RHS to 'character' (e.g. 1L, NA_[real|integer]_, as.*, etc) to make your intent clear and for speed. Or, set the column type correctly up front when you create the table and stick to it, please.
I receive the same warning when using only:
dt[rownum, ((names(dt))[colnum]) := list(list(myfun(dt[rownum, ])))]
Here is a clear illustrative example (not my real problem) of the issue I am facing:

col1 <- as.character(1:3)
col2 <- as.character(4:6)
col3 <- as.character(7:9)
dt <- data.table::data.table(var1 = col1, var2 = col2, var3 = col3)

myfun <- function(rowdt) {
  v1 <- as.numeric(rowdt$var1[1])
  v2 <- as.numeric(rowdt$var2[1])
  v3 <- as.numeric(rowdt$var3[1])
  col1 <- c(v1*v2, v1*v3)
  col2 <- c(v2*v2, v2*v3)
  return(data.table::data.table(var1 = col1, var2 = col2))
}

colnum = 3L
for (rownum in 1L:nrow(dt)) {
  data.table::set(dt, i=rownum, j=colnum, value = list(list(myfun(dt[rownum, ]))))
}

The above code yields the warning message shown earlier. However, this works:

colnum = 3L
dt$var3[1L] <- list(myfun(dt[1L, ]))
for (rownum in 2L:nrow(dt)) {
  data.table::set(dt, i=rownum, j=colnum, value = list(list(myfun(dt[rownum, ]))))
}

Is this the expected behavior? If it is, why does it happen, and how can I take advantage of the higher performance of data.table::set here?
data.table doesn't like coercing an entire column to a different class just because one of its values changes. It's totally fine if you replace the entire column, though. And that fits better with R's vectorized approach.

library(data.table)

myfun <- function(wholedt) {
  dt_copy <- copy(wholedt)
  dt_copy[, ':='(
    var1 = as.numeric(var1),
    var2 = as.numeric(var2),
    var3 = as.numeric(var3)
  )][, ':='(
    col1 = mapply(FUN = c, var1 * var2, var1 * var3, SIMPLIFY = FALSE),
    col2 = mapply(FUN = c, var2 * var2, var2 * var3, SIMPLIFY = FALSE)
  )][
    , var3 := apply(.SD, MARGIN = 1L, as.data.table), .SDcols = c('col1', 'col2')
  ]
  # Remove intermediate columns
  set(dt_copy, j = c('col1', 'col2'), value = NULL)
  dt_copy
}

dt_new <- myfun(dt)
dt_new
#    var1 var2         var3
# 1:    1    4 <data.table>
# 2:    2    5 <data.table>
# 3:    3    6 <data.table>
dt_new$var3[[1]]
#    col1 col2
# 1:    4   16
# 2:    7   28

I took your function, made it work column-wise instead of row-wise, and used mapply to build the list columns.
The esoteric part is the apply call. It runs through each row of the subset of the data (which is just the columns col1 and col2, as given by .SDcols). Each row is passed as a list, which is perfect for as.data.table.
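To make that step concrete, here is a small sketch of what a single row looks like by the time it reaches as.data.table. The names `row` and `cell` are hypothetical and only for illustration; the values are those of the first row in the example.

```r
library(data.table)

# One row of the col1/col2 subset, as a named list of equal-length vectors
# (hypothetical stand-in for what apply hands to as.data.table).
row <- list(col1 = c(4, 7), col2 = c(16, 28))

# Each list element becomes a column, so the result is a 2-row data.table.
cell <- as.data.table(row)
cell
```

Because every element of the list has the same length, the conversion yields one column per element, which is why the nested tables end up with columns col1 and col2.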
I don't like using apply, though; there's probably a more efficient solution. Until someone points one out, this should work the way you want.
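The warning text itself points at another route: replace the target column with a list column once, as a whole-column assignment, and only then update individual cells with set. The following is a minimal sketch under that assumption. It reuses the question's toy dt and myfun, except that var3 is read with `[[1]]` because the column is already a list by then. It has not been benchmarked against the column-wise version above.

```r
library(data.table)

dt <- data.table(var1 = as.character(1:3),
                 var2 = as.character(4:6),
                 var3 = as.character(7:9))

myfun <- function(rowdt) {
  v1 <- as.numeric(rowdt$var1[1])
  v2 <- as.numeric(rowdt$var2[1])
  v3 <- as.numeric(rowdt$var3[[1]])  # var3 is a list column at this point
  data.table(var1 = c(v1 * v2, v1 * v3), var2 = c(v2 * v2, v2 * v3))
}

colnum <- 3L

# Replace the whole column once, so its type is 'list' before any cell update.
set(dt, j = colnum, value = list(as.list(dt[[colnum]])))

# Cell-by-cell updates no longer need to coerce the column's type.
for (rownum in seq_len(nrow(dt))) {
  set(dt, i = rownum, j = colnum, value = list(list(myfun(dt[rownum, ]))))
}

dt$var3[[1]]
```

This is also a plausible reading of why the `dt$var3[1L] <- list(...)` workaround in the question helps: that assignment turns var3 into a list before set is ever called on it.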