R - Summarize data and keep date column value

July 15, 2010

I asked a similar question before and got an excellent answer, but I need more guidance on summarizing data with dates. The earlier question was: Summarize and count data in R with dplyr.

Goal:

In my new dataset I have a column with dates showing when each event occurred. When I proceed as in the example suggested in the other post, I get an error message.

Dataset:

df <- structure(list(
    user = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L),
    date = c("25.11.2015 13:59", "03.12.2015 09:32", "07.12.2015 08:18",
      "08.12.2015 19:40", "08.12.2015 19:40", "22.12.2015 08:50",
      "22.12.2015 08:52", "05.01.2016 13:22", "06.01.2016 09:18",
      "14.02.2016 22:47", "20.02.2016 21:27", "01.04.2016 13:52",
      "24.07.2016 07:03"),
    stimulia = c(0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L),
    stimulib = c(0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L),
    r2 = c(1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L),
    r3 = c(0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L),
    r4 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
    r5 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
    r6 = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L),
    r7 = c(0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L),
    stims = c("0_0", "0_0", "1_0", "1_0", "1_1", "1_1", "1_1", "1_1",
      "1_1", "1_2", "1_2", "1_2", "2_2")),
  .Names = c("user", "date", "stimulia", "stimulib", "r2", "r3", "r4",
    "r5", "r6", "r7", "stims"),
  row.names = c(NA, -13L),
  spec = structure(list(
    cols = structure(list(
      user = structure(list(), class = c("collector_integer", "collector")),
      date = structure(list(), class = c("collector_character", "collector")),
      stimulia = structure(list(), class = c("collector_integer", "collector")),
      stimulib = structure(list(), class = c("collector_integer", "collector")),
      r2 = structure(list(), class = c("collector_integer", "collector")),
      r3 = structure(list(), class = c("collector_integer", "collector")),
      r4 = structure(list(), class = c("collector_integer", "collector")),
      r5 = structure(list(), class = c("collector_integer", "collector")),
      r6 = structure(list(), class = c("collector_integer", "collector")),
      r7 = structure(list(), class = c("collector_integer", "collector"))),
      .Names = c("user", "date", "stimulia", "stimulib", "r2", "r3", "r4",
        "r5", "r6", "r7")),
    default = structure(list(), class = c("collector_guess", "collector"))),
    .Names = c("cols", "default"), class = "col_spec"),
  class = c("tbl_df", "tbl", "data.frame"))

Code:

df$stims <- with(df, paste(cumsum(stimulia), cumsum(stimulib), sep="_"))
aggregate(. ~ user + stims, data=df, sum)

Error in Summary.factor(c(12L, 2L), na.rm = FALSE) :
  'sum' not meaningful for factors

Question / desired result: In the result, I want to keep the date when the stimulus occurred (or, while stimuli A and B are both 0, the first date of the specific user).

user  date              stimulia  stimulib  r2  r3  r4  r5  r6  r7
1     25.11.2015 13:59  0         0         1   0   0   0   0   1
1     07.12.2015 08:18  1         0         0   0   0   0   1   0
1     08.12.2015 19:40  0         1         0   2   0   0   1   1
2     05.01.2016 13:22  0         0         0   0   0   0   1   0
2     14.02.2016 22:47  0         1         2   0   0   0   0   0
2     24.07.2016 07:03  1         0         0   0   0   0   0   0

In the result table, I have the sum of the values (r2-r7) while stimuli A and B are still 0 [line 1]. Then, for each stimulus, the sum of r2-r7 is noted until the next stimulus occurs.

This was suggested in the previous post, but I am unable to make it work:

You don't want to work with dates as factors. Transform the date into a Date variable using as.Date (there are many posts on this over on SO). One method is to separately aggregate the date variable by user and stims, similar to the above, but taking min rather than sum. Then merge the two resulting data.frames. If this does not make sense, it might be worth asking a new question that links to this question, adding the additional problem of the date variable. Include an example dataset that includes this variable. @lmo
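The suggestion quoted above could be sketched in base R roughly as follows. This is only an illustration under assumptions: the toy `df` is a shortened stand-in for the question's data, the names `sums`, `first_date` and `res` are made up here, and `as.POSIXct` is used in place of `as.Date` so that the time of day is kept.

```r
# Toy data shaped like the question's df (illustrative values, not the full dataset)
df <- data.frame(
  user     = c(1L, 1L, 1L, 2L, 2L),
  date     = c("25.11.2015 13:59", "03.12.2015 09:32", "07.12.2015 08:18",
               "05.01.2016 13:22", "14.02.2016 22:47"),
  stimulia = c(0L, 0L, 1L, 0L, 0L),
  stimulib = c(0L, 0L, 0L, 0L, 1L),
  r2       = c(1L, 0L, 0L, 0L, 2L),
  stringsAsFactors = FALSE
)

# Parse the day.month.year strings so they are date-times, not text or factors
df$date <- as.POSIXct(df$date, format = "%d.%m.%Y %H:%M", tz = "UTC")

# Grouping key from the question: running counts of stimuli A and B
df$stims <- with(df, paste(cumsum(stimulia), cumsum(stimulib), sep = "_"))

# Sum only the numeric columns (the date is left out of the sum on purpose)
sums <- aggregate(cbind(stimulia, stimulib, r2) ~ user + stims, data = df, FUN = sum)

# Separately take the earliest date per group
first_date <- aggregate(date ~ user + stims, data = df, FUN = min)

# Merge the two results on the grouping columns and sort chronologically per user
res <- merge(first_date, sums, by = c("user", "stims"))
res <- res[order(res$user, res$date), ]
print(res)
```

Leaving `date` out of the summed columns is what avoids the "'sum' not meaningful" error, since `. ~ user + stims` would otherwise try to sum every remaining column.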

One idea via dplyr is to filter the non-stimuli rows and grab the first observation of each user (via slice). Then filter the stimuli rows and bind_rows, i.e.

library(dplyr)

bind_rows(
  df %>%
    filter(rowSums(.[3:4]) == 0) %>%
    group_by(user) %>%
    slice(1L),
  df %>%
    filter(rowSums(.[3:4]) != 0)) %>%
  arrange(user)

which gives,

# A tibble: 6 x 11
# Groups:   user [2]
   user             date stimulia stimulib    r2    r3    r4    r5    r6    r7 stims
  <int>            <chr>    <int>    <int> <int> <int> <int> <int> <int> <int> <chr>
1     1 25.11.2015 13:59        0        0     1     0     0     0     0     0   0_0
2     1 07.12.2015 08:18        1        0     0     0     0     0     0     0   1_0
3     1 08.12.2015 19:40        0        1     0     0     0     0     0     0   1_1
4     2 05.01.2016 13:22        0        0     0     0     0     0     1     0   1_1
5     2 14.02.2016 22:47        0        1     0     0     0     0     0     0   1_2
6     2 24.07.2016 07:03        1        0     0     0     0     0     0     0   2_2
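The output above keeps the first row of each block but does not add up r2-r7 within it. If the sums are wanted as well, one possible variation is sketched below. It is not the answerer's method: it swaps the global `stims` key for a per-user block counter (the hypothetical column `block`), uses a shortened toy `df` with only `r2` and `r7`, and assumes dplyr 1.0 or later for `across()`.

```r
library(dplyr)

# Toy rows shaped like the question's data (illustrative values only)
df <- data.frame(
  user     = c(1L, 1L, 1L, 1L, 2L, 2L),
  date     = c("25.11.2015 13:59", "03.12.2015 09:32", "07.12.2015 08:18",
               "08.12.2015 19:40", "05.01.2016 13:22", "14.02.2016 22:47"),
  stimulia = c(0L, 0L, 1L, 0L, 0L, 0L),
  stimulib = c(0L, 0L, 0L, 0L, 0L, 1L),
  r2       = c(1L, 0L, 0L, 0L, 0L, 2L),
  r7       = c(0L, 1L, 0L, 1L, 0L, 0L),
  stringsAsFactors = FALSE
)

res <- df %>%
  # parse the text dates so sorting is chronological
  mutate(date = as.POSIXct(date, format = "%d.%m.%Y %H:%M", tz = "UTC")) %>%
  arrange(user, date) %>%
  group_by(user) %>%
  # block id: goes up by one at every row where a stimulus occurs
  mutate(block = cumsum(stimulia + stimulib > 0)) %>%
  group_by(user, block) %>%
  summarise(
    date = first(date),                           # date the block starts
    across(c(stimulia, stimulib, r2, r7), sum),   # totals within the block
    .groups = "drop"
  )

print(res)
```

Block 0 of each user holds the rows before any stimulus, so its date is that user's first date; every later block starts on the row where a stimulus occurred.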
