python - What is an efficient way to generate the top N pandas numeric columns with highest frequency of a particular number?


I am trying to get the top N numeric columns with the highest frequency of 1s (the only other value being 0). I understand the easiest way is to sum the numeric columns and sort them, but is there a more Pythonic/efficient way to achieve this?

A sample of the DataFrame follows:

df

non-numericcol1  non-numericcol2  col1  col2  col3  ...  coln
abc              pqr              1     0     1     ...  0
xyz              lmn              0     0     0     ...  1
abc              lmn              0     1     1     ...  0

What I wish to get is, let's say, the top 3 column names.

Example: d = {'col3': 2000, 'col10200': 1500, 'col4900': 1000}

I am okay with the output being in another format (such as a pandas DataFrame). There are 10000 total columns and 6000 rows.

Try this:

In [113]: df
Out[113]:
  non-numericcol1 non-numericcol2  col1  col2  col3  col4  coln
0             abc             pqr     1     0     1     0     0
1             xyz             lmn     0     0     0     0     1
2             abc             lmn     0     1     1     0     0

In [114]: df.select_dtypes(['number']).sum().nlargest(3)
Out[114]:
col3    2
col1    1
col2    1
dtype: int64
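If a plain dict like the one in the question is wanted, the same idea can be wrapped in a small helper. This is only a sketch: the function name top_n_columns and its parameters are made up for illustration, and the sample frame is a hypothetical copy of the three rows shown above. It counts exact matches with eq(value) rather than summing, which gives the same result for 0/1 data but also works if the number of interest is not 1.

```python
import pandas as pd

# Hypothetical sample data mirroring the three rows shown above.
df = pd.DataFrame({
    "non-numericcol1": ["abc", "xyz", "abc"],
    "non-numericcol2": ["pqr", "lmn", "lmn"],
    "col1": [1, 0, 0],
    "col2": [0, 0, 1],
    "col3": [1, 0, 1],
    "col4": [0, 0, 0],
    "coln": [0, 1, 0],
})


def top_n_columns(frame, n=3, value=1):
    """Return {column: count} for the n numeric columns where `value` occurs most often.

    The helper name and signature are illustrative, not part of pandas.
    """
    # Keep only numeric columns so the string columns are ignored.
    numeric = frame.select_dtypes(include="number")
    # Boolean mask of cells equal to `value`, summed per column to get a count.
    counts = numeric.eq(value).sum()
    # nlargest avoids a full sort; ties keep their original column order.
    return counts.nlargest(n).to_dict()


top3 = top_n_columns(df, n=3)
print(top3)
```

Since nlargest only has to pick n values out of one count per column, the dominant cost is the column-wise sum, which pandas performs in vectorised code, so this should remain practical at roughly 10000 columns by 6000 rows.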
