R - Avoiding the intermediate dlply step when starting with a data frame and ending with a data frame
I am using plyr to perform a bootstrapping function on subsets of a dataset.
Because the boot function creates a list object, I am using dlply to store the output of the function, and then a second plyr call to pull out the parts of the boot object that I want.
My example dataset is as follows:
dat = data.frame(
  x = rnorm(10, sd = 1),
  y = rnorm(10, sd = 1),
  z = rep(c("sppa", "sppb", "sppc", "sppd", "sppe"), 2),
  u = rep(c("sitea", "siteb"), 5)
)
The exact function isn't terribly important, but for the sake of reproducibility, here is the function I'm using:
boot_fun = function(x, i) {
  i = sample(1:24, 24, replace = TRUE)
  ts1 = mean(x[i, "x"])
  ts2 = sample(x[i, "y"])
  mean(ts1) - mean(ts2)
}
My plyr call is the following:
temp = dlply(dat, c("z", "u"), summarise, boot_object = boot(dat, boot_fun, R = 1000))
Since all I want out of the boot object is the mean and the CI, I then perform the following plyr call:
temp2 = ldply(temp, summarise,
              mean = mean(boot$t),
              lowci = quantile(boot$t, 0.025),
              highci = quantile(boot$t, 0.975))
This works and accomplishes what I want (although it gives an error about subsetting that doesn't seem to affect what I care about), but I feel there should be a way to skip the intermediate dlply step.
-edit- To clarify what I'm trying to do if I didn't need to split into groups:
If I were splitting manually instead of using plyr, I would do the following:
temp = boot(dat[dat$z == "sppa" & dat$u == "sitea", ], boot_fun, R = 1000)
temp2$mean = mean(temp$t)
temp2$lowci = quantile(temp$t, 0.025)
temp2$highci = quantile(temp$t, 0.975)
If I didn't care about the groups at all and wanted to do this for the whole dataset, it would look like:
temp = boot(dat, boot_fun, R = 1000)
temp2$mean = mean(temp$t)
temp2$lowci = quantile(temp$t, 0.025)
temp2$highci = quantile(temp$t, 0.975)
Your example is not reproducible for me. When I run
temp = boot(dat, boot_fun, R = 1000)
I get a warning:

ORDINARY NONPARAMETRIC BOOTSTRAP

Call:
boot(data = dat, statistic = boot_fun, R = 1000)

Bootstrap Statistics :
WARNING: All values of t1* are NA
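One plausible cause of the NA values, assuming the statistic draws its own indices with sample(1:24, ...) while dat has only 10 rows: indexing a data frame past its last row returns NA, and the mean of a vector containing NA is NA. A minimal sketch with a made-up three-row data frame (the names d, idx and vals are illustrative only):

```r
# Toy data frame with three rows
d <- data.frame(x = c(1, 2, 3))

# Row 5 does not exist, so the second value comes back as NA
idx <- c(1, 5)
vals <- d[idx, "x"]
vals        # 1 NA

# One NA is enough to make the mean NA
mean(vals)  # NA
```

If that is what is happening here, using the indices that boot() passes in, or sampling from seq_len(nrow(x)), would avoid the out-of-range rows.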
I think your current code is pretty efficient, but if you're looking for other possibilities, you could try the tidyverse:
1) group_by the relevant columns, 2) nest the relevant data for bootstrapping, 3) run the bootstrap on the nested data, 4) isolate the statistics you desire, 5) return to a normal data frame.
library(boot)
library(tidyverse)

dat1 <- dat %>%
  group_by(z, u) %>%
  nest() %>%
  mutate(data = map(data, ~boot(.x, boot_fun, R = 1000))) %>%
  mutate(data = map(data, ~data.frame(mean = mean(.x$t),
                                      lowci = quantile(.x$t, 0.025),
                                      highci = quantile(.x$t, 0.975)))) %>%
  unnest(data)
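If you would rather stay within plyr, one way to skip the intermediate list is to wrap the bootstrap and the summary in a single function that returns a one-row data frame, and hand that to ddply. This is only a sketch under stated assumptions: diff_means and boot_summary are hypothetical helpers, not the function from the question, and the data here is simulated with 10 rows per group so that resampling within each group is meaningful.

```r
library(boot)
library(plyr)

set.seed(1)
# Simulated data: 4 groups (2 species x 2 sites), 10 rows each
dat <- data.frame(
  x = rnorm(40),
  y = rnorm(40),
  z = rep(c("sppa", "sppb"), 20),
  u = rep(c("sitea", "siteb"), each = 20)
)

# Hypothetical statistic: difference between the mean of x and the mean of y
# over the resampled rows that boot() supplies through `i`
diff_means <- function(d, i) mean(d[i, "x"]) - mean(d[i, "y"])

# Run the bootstrap on one subset and return only the summaries of interest
boot_summary <- function(d, R = 1000) {
  b <- boot(d, diff_means, R = R)
  data.frame(
    mean   = mean(b$t),
    lowci  = unname(quantile(b$t, 0.025)),
    highci = unname(quantile(b$t, 0.975))
  )
}

# ddply splits by z and u, applies boot_summary, and row-binds the results
res <- ddply(dat, c("z", "u"), boot_summary, R = 200)
res
```

Because boot_summary returns a data frame rather than the boot object itself, ddply can combine the per-group results directly and no dlply/ldply pair is needed.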