python - Copying a Single Row From One Pandas Dataframe to Another Results in Missing Values -

September 15, 2012

I am trying to append the values of a single row in one pandas DataFrame to another. The two DataFrames have the same number of rows, so I did not expect this to cause issues. However, while it throws no errors, the output is problematic.

It results in the last two rows of the appended column being NaN values, and one of the values being omitted in the process.

Here is the first DataFrame, `ds1`:

+----+-----------+-------+-----------+------------+--------------------+
|    | unique id |  zip  |  revenue  | population | revenue_per_person |
+----+-----------+-------+-----------+------------+--------------------+
|  1 |       179 | 75208 |  67789037 |      30171 |     2246.827649067 |
|  2 |       186 | 75208 |  62488032 |      30171 |    2071.1289649001 |
|  3 |       180 | 75212 | 107230739 |      24884 |    4309.2243610352 |
|  4 |       182 | 75212 |  81768596 |      24884 |    3285.9908374859 |
|  5 |       181 | 75137 |  93296769 |      18861 |    4946.5441386989 |
|  6 |       183 | 75237 |  79177044 |      17101 |    4629.9657329981 |
|  7 |       187 | 75237 |  60000000 |      17101 |    3508.5667504824 |
|  9 |       185 | 75236 |  76489996 |      15949 |    4795.9117186031 |
| 10 |       189 | 75236 |  55203335 |      15949 |    3461.2411436454 |
| 11 |       188 | 75115 |  57451134 |      48877 |     1175.422673241 |
+----+-----------+-------+-----------+------------+--------------------+

And the second, `ds2`:

+---+-----------+-------+---------+
|   |     0     |   1   | cluster |
+---+-----------+-------+---------+
| 0 |  67789037 | 30171 |       1 |
| 1 |  62488032 | 30171 |       1 |
| 2 | 107230739 | 24884 |       0 |
| 3 |  81768596 | 24884 |       0 |
| 4 |  93296769 | 18861 |       0 |
| 5 |  79177044 | 17101 |       0 |
| 6 |  60000000 | 17101 |       1 |
| 7 |  76489996 | 15949 |       0 |
| 8 |  55203335 | 15949 |       1 |
| 9 |  57451134 | 48877 |       2 |
+---+-----------+-------+---------+

Here is the original code:

ds1['type'] = ds2['cluster']

When I check the values of ds1 after running the above line, these are the values in the ds1 DataFrame:

+----+-----------+-------+--------------------+------------+--------------------+------+
|    | unique id | zip   | revenue            | population | revenue_per_person | type |
+----+-----------+-------+--------------------+------------+--------------------+------+
| 1  | 179       | 75208 | 67789037.0         | 30171      | 2246.827649066985  | 1.0  |
| 2  | 186       | 75208 | 62488032.0         | 30171      | 2071.1289649000696 | 0.0  |
| 3  | 180       | 75212 | 107230738.99999999 | 24884      | 4309.2243610352025 | 0.0  |
| 4  | 182       | 75212 | 81768596.0         | 24884      | 3285.9908374859347 | 0.0  |
| 5  | 181       | 75137 | 93296769.0         | 18861      | 4946.544138698902  | 0.0  |
| 6  | 183       | 75237 | 79177044.0         | 17101      | 4629.96573299807   | 1.0  |
| 7  | 187       | 75237 | 60000000.0         | 17101      | 3508.566750482428  | 0.0  |
| 9  | 185       | 75236 | 76489995.99999999  | 15949      | 4795.911718603046  | 2.0  |
| 10 | 189       | 75236 | 55203334.99999999  | 15949      | 3461.241143645369  | nan  |
| 11 | 188       | 75115 | 57451133.99999999  | 48877      | 1175.4226732409925 | nan  |
+----+-----------+-------+--------------------+------------+--------------------+------+

It is interesting to note that the code does throw the following warning:

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

So I tried an alternative approach:

ds1['type'] = ds2.loc[:,'cluster']

This produces the same warning and the same DataFrame outcome, with a single missing value and two NaN values at the end.

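This is due to index misalignment. Notice that ds1 has index values of 10 and 11, and you are assigning a new column to ds1 from a Series that has no such index labels. This results in missing values for those two indices. For the same reason, the ds2 values at labels 0 and 8 are silently dropped, because ds1 has no rows with those labels.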
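The alignment behaviour can be reproduced with a small sketch that copies only the index labels and the cluster values from the tables in the question. The variable names `dropped` and `unfilled` are introduced here for illustration and are not part of the original code.

```python
import pandas as pd

# Index labels copied from the question: ds1 skips 0 and 8, ds2 runs 0..9.
ds1 = pd.DataFrame(
    {"unique id": [179, 186, 180, 182, 181, 183, 187, 185, 189, 188]},
    index=[1, 2, 3, 4, 5, 6, 7, 9, 10, 11],
)
ds2 = pd.DataFrame({"cluster": [1, 1, 0, 0, 0, 0, 1, 0, 1, 2]})

# Labels present on only one side are where values get lost.
dropped = ds2.index.difference(ds1.index)   # ds2 rows with no matching row in ds1
unfilled = ds1.index.difference(ds2.index)  # ds1 rows that end up as NaN

# Column assignment from a Series aligns on index labels, not on position.
ds1["type"] = ds2["cluster"]

print("dropped from ds2:", list(dropped))
print("NaN in ds1:", list(unfilled))
print(ds1)
```

Comparing the two indexes with `Index.difference` before assigning is a quick way to see in advance which rows will be lost or left empty.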

Assign the values of the right-hand side to the column on the left to bypass the alignment issue:

ds1['type'] = ds2['cluster'].values
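As a minimal sketch of why this works: `.values` returns a plain NumPy array with no index, so pandas has nothing to align on and fills the column by position. The three-row frames below are cut-down stand-ins for the real data, and the length check is an added safeguard that is not in the original answer.

```python
import pandas as pd

# Cut-down stand-ins: same idea as the question, mismatched index labels.
ds1 = pd.DataFrame({"revenue": [67789037, 62488032, 107230739]}, index=[1, 2, 11])
ds2 = pd.DataFrame({"cluster": [1, 1, 0]})

# Positional assignment only makes sense when both frames have the same
# number of rows and those rows are in the same order.
assert len(ds1) == len(ds2)

# A raw array carries no index, so the values are placed row by row.
ds1["type"] = ds2["cluster"].values

print(ds1)
```

Note that ds1 keeps its original index labels; only the new column is filled positionally.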

If the index is meaningless to you, call reset_index ahead of time:

ds1.reset_index(drop=True, inplace=True)
ds2.reset_index(drop=True, inplace=True)

ds1['type'] = ds2['cluster']
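The warning quoted in the question is a separate matter from the NaN values. It usually suggests that ds1 was itself produced by slicing or filtering another DataFrame. The question does not show how ds1 was built, so the following is only a sketch under that assumption; `raw` is a hypothetical parent frame and the filter condition is made up for illustration.

```python
import pandas as pd

# Hypothetical parent frame; the question does not show where ds1 came from.
raw = pd.DataFrame(
    {
        "zip": [75208, 75208, 75212, 75137],
        "revenue": [67789037, 62488032, 107230739, 93296769],
    }
)

# Filtering gives a derived object. Taking an explicit .copy() makes ds1 an
# independent frame, so adding a column to it cannot be confused with an
# attempt to modify raw.
ds1 = raw[raw["zip"] != 75137].copy()
ds2 = pd.DataFrame({"cluster": [1, 1, 0]})

# Positional assignment, as in the answer above.
ds1["type"] = ds2["cluster"].values

print(ds1)
```

Whether the warning appears at all depends on the pandas version in use, so treat this as one possible way to make the intent explicit rather than a required step.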
