python - Find column name in pandas that matches an array


I have a large DataFrame (5000 x 12039) and want to find the name of the column whose values match a given NumPy array.

For example, if I have this table:

                  m1lenhr  m1lenmin    m1citywt    m1a12a  cm1age  cm1numb  m1b1a  m1b1b    m1b12a    m1b12b  ...  kind_attention_scale_10  kind_attention_scale_22  kind_attention_scale_21  kind_attention_scale_15  kind_attention_scale_18  kind_attention_scale_19  kind_attention_scale_25  kind_attention_scale_24  kind_attention_scale_27  kind_attention_scale_23
    challengeid
    1            0.130765      40.0  202.485367  1.893256    27.0      1.0    2.0    0.0  2.254198  2.289966  ...                        0                        0                        0                        0                        0                        0                        0                        0                        0                        0
    2            0.000000      40.0   45.608219  1.000000    24.0      1.0    2.0    0.0  2.000000  3.000000  ...                        0                        0                        0                        0                        0                        0                        0                        0                        0                        0
    3            0.000000      35.0   39.060299  2.000000    23.0      1.0    2.0    0.0  2.254198  2.289966  ...                        0                        0                        0                        0                        0                        0                        0                        0                        0                        0
    4            0.000000      30.0   22.304855  1.893256    22.0      1.0    3.0    0.0  2.000000  3.000000  ...                        0                        0                        0                        0                        0                        0                        0                        0                        0                        0
    5            0.000000      25.0   35.518272  1.893256    19.0      1.0    1.0    6.0  1.000000  3.000000  ...                        0

I want this:

    x = [40.0, 40.0, 35.0, 30.0, 25.0]
    find_column(x)

and have find_column(x) return m1lenmin.

Approach #1

Here's one vectorized approach leveraging NumPy broadcasting:

    df.columns[(df.values == np.asarray(x)[:, None]).all(0)]

Sample run:

    In [367]: df
    Out[367]:
       0  1  2  3  4  5  6  7  8  9
    0  7  1  2  6  2  1  7  2  0  6
    1  5  4  3  3  2  1  1  1  5  5
    2  7  7  2  2  5  4  6  6  5  7
    3  0  5  4  1  5  7  8  2  2  4
    4  7  1  0  4  5  4  3  2  8  6

    In [368]: x = df.iloc[:,2].values.tolist()

    In [369]: x
    Out[369]: [2, 3, 2, 4, 0]

    In [370]: df.columns[(df.values == np.asarray(x)[:,None]).all(0)]
    Out[370]: Int64Index([2], dtype='int64')
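To tie this back to the question, here is a minimal sketch of a find_column helper built on the same broadcasting comparison. The function signature, the optional atol parameter, and the three-column DataFrame (values copied from the question's table) are assumptions made for illustration, not part of the original answer. The tolerance option is there because the question's data is floating point, where exact equality can miss values that differ only by rounding.

```python
import numpy as np
import pandas as pd


def find_column(df, x, atol=None):
    """Return the labels of the columns of df whose values match x row by row.

    Hypothetical helper: atol=None means exact equality, otherwise an
    absolute tolerance is used for the comparison.
    """
    target = np.asarray(x)[:, None]      # shape (n_rows, 1), broadcasts across columns
    values = df.to_numpy()
    if atol is None:
        mask = values == target          # exact elementwise comparison
    else:
        mask = np.isclose(values, target, rtol=0.0, atol=atol)
    return df.columns[mask.all(axis=0)]  # keep columns where every row matched


# Three columns copied from the question's table (illustrative subset)
df = pd.DataFrame(
    {
        "m1lenhr": [0.130765, 0.0, 0.0, 0.0, 0.0],
        "m1lenmin": [40.0, 40.0, 35.0, 30.0, 25.0],
        "cm1age": [27.0, 24.0, 23.0, 22.0, 19.0],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="challengeid"),
)

matches = find_column(df, [40.0, 40.0, 35.0, 30.0, 25.0])
print(list(matches))  # ['m1lenmin']
```

The result is an Index rather than a single label, so duplicate columns would all be returned and no match gives an empty Index.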

Approach #2

Alternatively, here's one using the concept of views:

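    import numpy as np

    def view1d(a, b):  # a, b are arrays
        a = np.ascontiguousarray(a)
        # One void element spans the bytes of an entire row of a
        void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
        return a.view(void_dt).ravel(), b.view(void_dt).ravel()

    # Transpose so that each column of df becomes a row, then compare whole rows
    df1d_arr, x1d = view1d(df.values.T, np.asarray(x)[None])
    out = np.flatnonzero(df1d_arr == x1d)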

Sample run:

    In [442]: df
    Out[442]:
       0  1  2  3  4  5  6  7  8  9
    0  7  1  2  6  2  1  7  2  0  6
    1  5  4  3  3  2  1  1  1  5  5
    2  7  7  2  2  5  4  6  6  5  7
    3  0  5  4  1  5  7  8  2  2  4
    4  7  1  0  4  5  4  3  2  8  6

    In [443]: x = df.iloc[:,5].values.tolist()

    In [444]: df1d_arr, x1d = view1d(df.values.T,np.asarray(x)[None])

    In [445]: np.flatnonzero(df1d_arr==x1d)
    Out[445]: array([5])
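To make the view idea more concrete, the sketch below only inspects shapes and item sizes. The small array and the variable names are made up for illustration. It shows how viewing a C-contiguous 2D array with a void dtype as wide as one row collapses each row into a single element, which is what lets whole columns of the DataFrame be compared in one step after transposing.

```python
import numpy as np

# Illustrative data: 2 rows, 3 columns
arr = np.array([[7, 1, 2],
                [5, 4, 3]], dtype=np.int64)

# Transpose so each original column becomes a row, and force a C-contiguous copy
cols_as_rows = np.ascontiguousarray(arr.T)

# A void dtype whose size covers all the bytes of one row
void_dt = np.dtype((np.void, cols_as_rows.dtype.itemsize * cols_as_rows.shape[1]))

# Each row is now a single opaque element
collapsed = cols_as_rows.view(void_dt).ravel()

print(cols_as_rows.shape)  # (3, 2)
print(void_dt.itemsize)    # 16
print(collapsed.shape)     # (3,)
```

Because the comparison is done on raw bytes, both arrays need the same dtype, and values that are numerically equal but stored differently (for example 0.0 and -0.0 in floating point data) would not be treated as a match.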
