dplyr - Conditional counting in R: add if both rows == TRUE
I have a data frame of bacterial colony counts (accn) from 2 different methods of sampling: swabs and plates. I'd like to count the number of times the colony counts agree for both methods across a series of standards (e.g. if accn <= 2.5, etc.).
    head(ea)
      sample group     accn
    1      e     1 14.84500
    2      s     1  2.07500
    3      e     2 13.70167
    4      s     2  6.60000
    5      e     3 11.45833
    6      s     3  7.90000

So far I've got:
    s <- (ea$accn <= "2.5" & ea$sample == "s")
    p <- (ea$accn <= "2.5" & ea$sample == "p")
    pe <- cbind(s, p)
    pe <- as.data.frame(pe)
    sum(pe)

but I receive the error:

    Error in FUN(X[[i]], ...) :
      only defined on a data frame with all numeric variables
With dplyr:

    library(dplyr)

    ea %>%
      mutate(s = ifelse(as.numeric(accn) <= 2.5 & sample == "s", 1, 0)) %>%
      mutate(p = ifelse(as.numeric(accn) <= 2.5 & sample == "p", 1, 0)) %>%
      summarise(pe_sum = sum(s, p))

But if you want the data frame itself, then:
    ea %>%
      mutate(s = ifelse(as.numeric(accn) <= 2.5 & sample == "s", 1, 0)) %>%
      mutate(p = ifelse(as.numeric(accn) <= 2.5 & sample == "p", 1, 0))

If you don't care about having distinct "p" and "s" columns, you can write it more succinctly:
    ea %>%
      mutate(new = ifelse(as.numeric(accn) <= 2.5 & sample %in% c("s", "p"), 1, 0)) %>%
      summarise(new_sum = sum(new))

Or use what you have:
    s <- (ea$accn <= "2.5" & ea$sample == "s")
    p <- (ea$accn <= "2.5" & ea$sample == "p")

but then:
    sum(s, p)

or:
    s <- (ea$accn <= "2.5" & ea$sample == "s")
    p <- (ea$accn <= "2.5" & ea$sample == "p")
    pe <- cbind(s, p)

but then:
    sum(pe)  # keeping the object as a matrix, not turning it into a data frame

To sum for each value from 1 to 30 (optional), per your question in the comment section, the answer would be:
    library(dplyr)

    x <- 1:30
    (sapply(x, function(x) {ifelse(as.numeric(ea$accn) <= x & ea$sample == "s", 1, 0)}) +
      sapply(x, function(x) {ifelse(as.numeric(ea$accn) <= x & ea$sample == "p", 1, 0)})) %>%
      as.data.frame() %>%
      summarise_all(sum)

though I don't know the exact structure of the output you're seeking.
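If "agree" means that both samples in the same group fall at or below the cutoff, a grouped variant may be closer to what you're after. This is only a sketch under that assumption: it rebuilds the six rows shown in the question (which contain samples "e" and "s"), and count_agreeing is a hypothetical helper name, not something provided by dplyr.

```r
library(dplyr)

# The six rows printed by head(ea) in the question.
ea <- data.frame(
  sample = c("e", "s", "e", "s", "e", "s"),
  group  = c(1, 1, 2, 2, 3, 3),
  accn   = c(14.84500, 2.07500, 13.70167, 6.60000, 11.45833, 7.90000)
)

# Hypothetical helper: number of groups in which every sample
# has accn at or below the cutoff (assumed meaning of "agree").
count_agreeing <- function(df, cutoff) {
  df %>%
    group_by(group) %>%
    summarise(both = all(as.numeric(accn) <= cutoff)) %>%
    summarise(n_agree = sum(both)) %>%
    pull(n_agree)
}

count_agreeing(ea, 2.5)

# Same count for each cutoff from 1 to 30.
agree_by_cutoff <- sapply(1:30, function(x) count_agreeing(ea, x))
agree_by_cutoff
```

On these six rows no group agrees at a cutoff of 2.5, because the "e" sample is above it in every group. Note that the cutoff is passed as a number rather than the string "2.5", so the comparison is numeric and not character-based.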