R - Summarize data and keep date column value
I asked a similar question before and got an excellent answer, but I need more guidance on the topic of summarizing data with dates. The earlier question: "Summarize and count data in R with dplyr".
Goal:
In my new dataset I have a column with dates, showing when each event occurred. When I proceed as in the example suggested in the other post, I get an error message (shown after the code further down).
Dataset:
df <- structure(list(
  user = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L),
  date = c("25.11.2015 13:59", "03.12.2015 09:32", "07.12.2015 08:18",
           "08.12.2015 19:40", "08.12.2015 19:40", "22.12.2015 08:50",
           "22.12.2015 08:52", "05.01.2016 13:22", "06.01.2016 09:18",
           "14.02.2016 22:47", "20.02.2016 21:27", "01.04.2016 13:52",
           "24.07.2016 07:03"),
  stimulia = c(0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L),
  stimulib = c(0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L),
  r2 = c(1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L),
  r3 = c(0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L),
  r4 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
  r5 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
  r6 = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L),
  r7 = c(0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L),
  stims = c("0_0", "0_0", "1_0", "1_0", "1_1", "1_1", "1_1", "1_1",
            "1_1", "1_2", "1_2", "1_2", "2_2")),
  .Names = c("user", "date", "stimulia", "stimulib", "r2", "r3", "r4",
             "r5", "r6", "r7", "stims"),
  row.names = c(NA, -13L),
  spec = structure(list(
    cols = structure(list(
      user = structure(list(), class = c("collector_integer", "collector")),
      date = structure(list(), class = c("collector_character", "collector")),
      stimulia = structure(list(), class = c("collector_integer", "collector")),
      stimulib = structure(list(), class = c("collector_integer", "collector")),
      r2 = structure(list(), class = c("collector_integer", "collector")),
      r3 = structure(list(), class = c("collector_integer", "collector")),
      r4 = structure(list(), class = c("collector_integer", "collector")),
      r5 = structure(list(), class = c("collector_integer", "collector")),
      r6 = structure(list(), class = c("collector_integer", "collector")),
      r7 = structure(list(), class = c("collector_integer", "collector"))),
      .Names = c("user", "date", "stimulia", "stimulib", "r2", "r3", "r4",
                 "r5", "r6", "r7")),
    default = structure(list(), class = c("collector_guess", "collector"))),
    .Names = c("cols", "default"), class = "col_spec"),
  class = c("tbl_df", "tbl", "data.frame"))
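Code:
# Build a grouping key from the running counts of both stimuli
df$stims <- with(df, paste(cumsum(stimulia), cumsum(stimulib), sep="_"))
# Sum every remaining column per user and stimulus block
aggregate(. ~ user + stims, data=df, sum)

Error in Summary.factor(c(12L, 2L), na.rm = FALSE) :
  ‘sum’ not meaningful for factors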
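The error arises because the `.` in the formula passes every remaining column to `sum`, including the date column, and `sum` is not defined for factors. A minimal sketch of this, using a hypothetical two-row data frame `toy` (not the data from the question) whose date column is stored as a factor:

```r
# Hypothetical toy data: 'date' is a factor, as in the failing call.
toy <- data.frame(
  user = c(1L, 1L),
  date = factor(c("25.11.2015 13:59", "03.12.2015 09:32")),
  r2   = c(1L, 0L)
)

# Summing a factor fails; capture the error message instead of stopping.
msg <- tryCatch(
  aggregate(. ~ user, data = toy, FUN = sum),
  error = function(e) conditionMessage(e)
)
print(msg)

# Leaving the date column out of the formula lets the sum go through.
ok <- aggregate(r2 ~ user, data = toy, FUN = sum)
print(ok)
```

This only shows where the error comes from; it does not yet carry the date through to the result.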
Question/desired result: In the result, I want to keep the date of when each stimulus occurred (or, while stimuli A and B are still 0, the first date of the specific user).
user  date              stimulia  stimulib  r2  r3  r4  r5  r6  r7
1     25.11.2015 13:59  0         0         1   0   0   0   0   1
1     07.12.2015 08:18  1         0         0   0   0   0   1   0
1     08.12.2015 19:40  0         1         0   2   0   0   1   1
2     05.01.2016 13:22  0         0         0   0   0   0   1   0
2     14.02.2016 22:47  0         1         2   0   0   0   0   0
2     24.07.2016 07:03  1         0         0   0   0   0   0   0
In the result table, I have the sum of the values (r2-r7) while stimuli A and B are still 0 [line 1]. For each stimulus, the sum of r2-r7 is noted until the next stimulus occurs.
This was suggested in the previous post, but I was unable to make it work:
"You don't want to work with dates as factors. Transform the date into a Date variable using as.Date (there are many posts on this over SO). One method is to separately aggregate the date variable by user and stims, similar to the above, taking min rather than sum. Then merge the two resulting data.frames. If this does not make sense, it might be worth asking a new question that links to this question, adding the additional problem of the date variable. Include an example dataset that includes this variable." @lmo
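The suggestion quoted above could be sketched in base R roughly as follows. This is an illustration under assumptions, not the commenter's actual code: the data frame is a shortened stand-in with only two of the r columns, and `as.POSIXct` is used in place of `as.Date` so that the time of day is kept.

```r
# Shortened stand-in for the question's data (assumption: fewer rows/columns).
df <- data.frame(
  user     = c(1L, 1L, 1L, 1L, 2L, 2L, 2L),
  date     = c("25.11.2015 13:59", "03.12.2015 09:32", "07.12.2015 08:18",
               "08.12.2015 19:40", "05.01.2016 13:22", "06.01.2016 09:18",
               "14.02.2016 22:47"),
  stimulia = c(0L, 0L, 1L, 0L, 0L, 0L, 0L),
  stimulib = c(0L, 0L, 0L, 0L, 0L, 0L, 1L),
  r2       = c(1L, 0L, 0L, 0L, 0L, 0L, 0L),
  r6       = c(0L, 0L, 0L, 1L, 1L, 0L, 0L),
  stringsAsFactors = FALSE
)

# Parse the date strings into date-times (day.month.year hour:minute).
df$date <- as.POSIXct(df$date, format = "%d.%m.%Y %H:%M", tz = "UTC")

# Same grouping key as in the question: running counts of both stimuli.
df$stims <- with(df, paste(cumsum(stimulia), cumsum(stimulib), sep = "_"))

keys <- df[c("user", "stims")]

# Aggregate 1: sums of the numeric columns per user and stimulus block.
sums <- aggregate(df[c("stimulia", "stimulib", "r2", "r6")], by = keys, FUN = sum)

# Aggregate 2: earliest date per user and stimulus block.
first_date <- aggregate(df["date"], by = keys, FUN = min)

# Merge the two results and sort by user, then date.
res <- merge(first_date, sums, by = c("user", "stims"))
res <- res[order(res$user, res$date), ]
print(res)
```

Keeping the date out of the summed columns avoids the error, and taking `min` per group yields the date on which each block starts.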
One idea via dplyr:
Filter the non-stimuli rows and grab the first observation for each user (via slice). Then filter the stimuli rows and combine both with bind_rows, i.e.
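library(dplyr)

bind_rows(
  # rows without any stimulus: keep only the first one per user
  df %>% filter(rowSums(.[3:4]) == 0) %>% group_by(user) %>% slice(1L),
  # rows where stimulus A or B occurred
  df %>% filter(rowSums(.[3:4]) != 0)
) %>%
  arrange(user)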
which gives,
# tibble: 6 x 11
# groups: user [2]
   user date             stimulia stimulib    r2    r3    r4    r5    r6    r7 stims
  <int> <chr>               <int>    <int> <int> <int> <int> <int> <int> <int> <chr>
1     1 25.11.2015 13:59        0        0     1     0     0     0     0     0 0_0
2     1 07.12.2015 08:18        1        0     0     0     0     0     0     0 1_0
3     1 08.12.2015 19:40        0        1     0     0     0     0     0     0 1_1
4     2 05.01.2016 13:22        0        0     0     0     0     0     1     0 1_1
5     2 14.02.2016 22:47        0        1     0     0     0     0     0     0 1_2
6     2 24.07.2016 07:03        1        0     0     0     0     0     0     0 2_2
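The rows above keep the dates, but their r2-r7 values appear to be those of the individual rows rather than sums over each block. If sums are wanted as well, one possible sketch is to group by user and the `stims` key and summarise. This assumes dplyr 1.0 or later (for `across`) and uses a shortened stand-in for the question's data with only two of the r columns:

```r
library(dplyr)

# Shortened stand-in for the question's data (assumption: fewer rows/columns).
df <- data.frame(
  user     = c(1L, 1L, 1L, 1L, 2L, 2L, 2L),
  date     = c("25.11.2015 13:59", "03.12.2015 09:32", "07.12.2015 08:18",
               "08.12.2015 19:40", "05.01.2016 13:22", "06.01.2016 09:18",
               "14.02.2016 22:47"),
  stimulia = c(0L, 0L, 1L, 0L, 0L, 0L, 0L),
  stimulib = c(0L, 0L, 0L, 0L, 0L, 0L, 1L),
  r2       = c(1L, 0L, 0L, 0L, 0L, 0L, 0L),
  r6       = c(0L, 0L, 0L, 1L, 1L, 0L, 0L),
  stringsAsFactors = FALSE
)

res <- df %>%
  mutate(
    # parse the strings so that min() compares real date-times
    date  = as.POSIXct(date, format = "%d.%m.%Y %H:%M", tz = "UTC"),
    # same grouping key as in the question
    stims = paste(cumsum(stimulia), cumsum(stimulib), sep = "_")
  ) %>%
  group_by(user, stims) %>%
  summarise(
    date = min(date),                             # first date of the block
    across(c(stimulia, stimulib, r2, r6), sum),   # sums within the block
    .groups = "drop"
  ) %>%
  arrange(user, date)

print(res)
```

Each output row then carries the first date of its block together with the block's sums, which is closer to the desired table in the question.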