R - Aggregate multiple rows in a data frame
I have a data frame that looks like this:
    id = c(3, 3, 4, 5, 5)
    a_2015 = c("abc", NA, NA, "abc", NA)
    a_2016 = c(NA, "def", "abc", NA, "abc")
    df = data.frame(id, a_2015, a_2016)
    df
      id a_2015 a_2016
    1  3    abc     NA
    2  3     NA    def
    3  4     NA    abc
    4  5    abc     NA
    5  5     NA    abc
That means that if there is an entry in column a_2015, there is an NA in a_2016, or vice versa. A row can never have a valid entry in both columns a_2015 and a_2016.
I would like to aggregate the data frame like this:
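The constraint described above can be checked directly before aggregating. The following is a minimal sketch in base R; it assumes the missing entries are real NA values (not the string "NA"), and the name both_filled is only illustrative.

```r
# Example data from the question, assuming real NA values for missing entries.
df <- data.frame(
  id     = c(3, 3, 4, 5, 5),
  a_2015 = c("abc", NA, NA, "abc", NA),
  a_2016 = c(NA, "def", "abc", NA, "abc"),
  stringsAsFactors = FALSE
)

# TRUE for every row in which both year columns hold a non-missing value.
both_filled <- !is.na(df$a_2015) & !is.na(df$a_2016)

# If the constraint holds, no row is flagged.
any(both_filled)
```

If any(both_filled) returned TRUE, collapsing the rows per id would silently merge two valid entries, so it is worth running this check first.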
    id a_2015 a_2016
     3    abc    def
     4     NA    abc
     5    abc    abc
I tried to solve this with aggregate, but I think I need apply, or? I am thankful for any hints!
You can use dplyr as well:
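    library(tidyverse)
    df %>%
      group_by(id) %>%
      # paste both columns row by row, then glue all rows of an id together
      summarise(tmp = paste(a_2015, a_2016, collapse = "")) %>%
      # drop the "NANA " that appears where two rows were glued together
      mutate(tmp = gsub("NANA ", "", tmp)) %>%
      separate(tmp, into = c("a_2015", "a_2016"), sep = " ")
    # A tibble: 3 x 3
         id a_2015 a_2016
    * <dbl> <chr>  <chr>
    1     3 abc    def
    2     4 NA     abc
    3     5 abc    abc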
Or base R:
    aggregate(df[,-1], list(df$id), function(x) gsub("NA", "", paste0(x, collapse = "")))
      Group.1 a_2015 a_2016
    1       3    abc    def
    2       4           abc
    3       5    abc    abc
Then you have to replace "" with NA, and edit the colnames.
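Those two clean-up steps could look roughly like this. It is only a sketch under the assumption that the missing entries are real NA values; the result name res is illustrative and not part of the original answer.

```r
# Example data from the question, assuming real NA values for missing entries.
df <- data.frame(
  id     = c(3, 3, 4, 5, 5),
  a_2015 = c("abc", NA, NA, "abc", NA),
  a_2016 = c(NA, "def", "abc", NA, "abc"),
  stringsAsFactors = FALSE
)

# Collapse the rows of each id and strip the "NA" text that paste0 produces.
res <- aggregate(df[, -1], list(df$id),
                 function(x) gsub("NA", "", paste0(x, collapse = "")))

# Turn the empty strings back into real missing values.
res[res == ""] <- NA

# aggregate names the grouping column "Group.1"; rename it to "id".
names(res)[1] <- "id"
res
```

Note that gsub("NA", "", ...) would also remove the letters "NA" from inside a genuine value, so this approach is only safe when the real entries never contain that substring.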