Collapse columns in a dataframe (R)


Basically, I have a dataframe, df:

              beginning1 protein2 protein3 protein4 biomarker1
    pathway3  a          g        NA       NA       f
    pathway8  z          g        NA       NA       e
    pathway9  a          g        z        h        f
    pathway6  y          g        z        h        e
    pathway2  a          g        d        NA       f
    pathway5  q          g        d        NA       e
    pathway1  a          d        k        NA       f
    pathway7  a          b        c        d        f
    pathway4  v          b        c        d        e
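For anyone who wants to reproduce this, the dataframe could be rebuilt roughly as follows. This is only a sketch under a few assumptions: the pathway labels are row names, the na entries are genuine missing values (NA), and beginning1 is "a" for the rows pathway3, pathway9, pathway2, pathway1 and pathway7.

```r
# Rebuild the example data; pathway labels are assumed to be row names.
df <- data.frame(
  beginning1 = c("a", "z", "a", "y", "a", "q", "a", "a", "v"),  # "a" values are assumed
  protein2   = c("g", "g", "g", "g", "g", "g", "d", "b", "b"),
  protein3   = c(NA, NA, "z", "z", "d", "d", "k", "c", "c"),
  protein4   = c(NA, NA, "h", "h", NA, NA, NA, "d", "d"),
  biomarker1 = c("f", "e", "f", "e", "f", "e", "f", "f", "e"),
  row.names  = c("pathway3", "pathway8", "pathway9", "pathway6", "pathway2",
                 "pathway5", "pathway1", "pathway7", "pathway4"),
  stringsAsFactors = FALSE
)

print(df)
```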

I want to combine the dataframe's rows when "protein2" to "protein4" are identical and condense them, giving the following:

              beginning1 protein2 protein3 protein4 biomarker1
    pathway3  a,z        g        NA       NA       f,e
    pathway9  a,y        g        z        h        f,e
    pathway2  a,q        g        d        NA       f,e
    pathway1  a          d        k        NA       f
    pathway7  a,v        b        c        d        f,e

This is similar to a question asked before ("Consolidating duplicate rows in a dataframe"); the difference is that I am also consolidating the "beginning1" column.

So far, I have tried:

    library(data.table)
    dat <- data.table(df)

    # collapse biomarker1 within each group
    total_collapse <- dat[, .(biomarker1 = paste0(biomarker1, collapse = ", ")),
                          by = .(beginning1, protein1, protein2, protein3)]

    # collapse beginning1 within each group (this overwrites the result above)
    total_collapse <- dat[, .(beginning1 = paste0(beginning1, collapse = ", ")),
                          by = .(protein1, protein2, protein3)]

Which gives the output:

              beginning1 protein2 protein3 protein4 biomarker1
    pathway3             g        NA       NA       f,e
    pathway9             g        z        h        f,e
    pathway2             g        d        NA       f,e
    pathway1             d        k        NA       f
    pathway7             b        c        d        f,e

Does anyone know how to fix this problem? I have tried duplicating the solution from "Collapse / concatenate / aggregate a column to a single comma separated string within each group", but have had no success.

I am sorry if this is a simple error; I am pretty new to R.

Here's a possible solution using dplyr:

    library(dplyr)

    # paste the remaining columns together within each protein group
    df %>%
      group_by_at(vars(protein2:protein4)) %>%
      summarize_all(paste, collapse = ",")
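If you would rather stay with data.table, as in the question, one possible sketch is to collapse both columns in a single call and group only by the three protein columns. In the attempt from the question each call collapses just one column, and the second assignment replaces the first result, which may be why only one column ended up consolidated. The example data below is rebuilt by hand and rests on assumptions: beginning1 is taken to be "a" where the question's table shows no other letter, na is treated as NA, and the pathway column is a hypothetical name used to keep the labels, since data.table does not keep row names.

```r
library(data.table)

# Example data; "pathway" is a hypothetical column holding the row labels,
# and the "a" values in beginning1 are assumed.
dat <- data.table(
  pathway    = c("pathway3", "pathway8", "pathway9", "pathway6", "pathway2",
                 "pathway5", "pathway1", "pathway7", "pathway4"),
  beginning1 = c("a", "z", "a", "y", "a", "q", "a", "a", "v"),
  protein2   = c("g", "g", "g", "g", "g", "g", "d", "b", "b"),
  protein3   = c(NA, NA, "z", "z", "d", "d", "k", "c", "c"),
  protein4   = c(NA, NA, "h", "h", NA, NA, NA, "d", "d"),
  biomarker1 = c("f", "e", "f", "e", "f", "e", "f", "f", "e")
)

# Collapse both columns in one call, grouping by the protein columns only.
total_collapse <- dat[, .(
  pathway    = pathway[1],                         # label of the first row in each group
  beginning1 = paste(beginning1, collapse = ","),
  biomarker1 = paste(biomarker1, collapse = ",")
), by = .(protein2, protein3, protein4)]

print(total_collapse)
```

Rows whose protein columns are NA are still grouped together here, because data.table treats NA as a group value of its own. Use collapse = ", " instead if you want a space after each comma.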
