r - Creating a new column where each element is the count of subsets of two other columns, loop free
I have a data frame that looks like the table below; it keeps track of a person visiting a store in a given month. I want to create a new column, total_visits, that is the count of the number of times an id visited the store during that month. In the example below, for date 6-13 and id 23, total_visits should be 3 in every row where date == 6-13 and id == 23.
    date  id
    6-13  23
    6-13  34
    6-13  23
    6-13  23
    7-13  23
The data frame I'm looking for:
    date  id  total_visits
    6-13  23  3
    6-13  34  1
    6-13  23  3
    6-13  23  3
    7-13  23  1
While I assume there is some sort of acast function that would ensure I don't have to loop through this (30,000 rows), I would be OK with a loop if vectorization did not work.
You can use the dplyr package:
    library(dplyr)

    # Count the rows in each (date, id) group and attach the count to every row
    df %>%
      group_by(date, id) %>%
      mutate(total_visits = n())
    ## # A tibble: 5 x 3
    ## # Groups:   date, id [3]
    ##     date    id total_visits
    ##   <fctr> <int>        <int>
    ## 1   6-13    23            3
    ## 2   6-13    34            1
    ## 3   6-13    23            3
    ## 4   6-13    23            3
    ## 5   7-13    23            1
Use data.frame() (or as.data.frame()) on the output to convert it to a plain data frame.
Update: alternatively, use the data.table package:
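    library(data.table)

    # setDT() converts df to a data.table by reference;
    # .N is the number of rows in each (date, id) group
    setDT(df)[, total_visits := .N, by = c("date", "id")]
    df
    ##    date id total_visits
    ## 1: 6-13 23            3
    ## 2: 6-13 34            1
    ## 3: 6-13 23            3
    ## 4: 6-13 23            3
    ## 5: 7-13 23            1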
Data:
    df <- structure(list(date = structure(c(1L, 1L, 1L, 1L, 2L),
                                          .Label = c("6-13", "7-13"),
                                          class = "factor"),
                         id = c(23L, 34L, 23L, 23L, 23L)),
                    .Names = c("date", "id"),
                    class = "data.frame",
                    row.names = c(NA, -5L))
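If you would rather not load a package, the same per-group count can probably be had in base R with ave(). This is not part of the answer above, just a minimal sketch; it rebuilds the example data with data.frame() so that it runs on its own, and it assumes the column names date and id from the question.

```r
# Rebuild the example data from the question (same values, plain data frame).
df <- data.frame(
  date = c("6-13", "6-13", "6-13", "6-13", "7-13"),
  id   = c(23L, 34L, 23L, 23L, 23L)
)

# ave() splits its first argument by the grouping variables, applies FUN to
# each piece, and returns a vector aligned with the original rows.
# With FUN = length, every row receives the size of its (date, id) group.
df$total_visits <- ave(seq_along(df$id), df$date, df$id, FUN = length)

df$total_visits
# 3 1 3 3 1
```

The first argument only needs to be a vector with one element per row; seq_along(df$id) is used here so the result is a count regardless of what the columns contain.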