python - What are the rules for pandas' elementwise binary boolean operands with same length elements holding differing indexes?
I have been applying binary boolean operators in my code base and came across a bug that surprised me. I've reconstructed a minimal working example to demonstrate the behavior below...
    import pandas

    s = pandas.Series( [True]*4 )
    d = pandas.DataFrame( { 'a':[True, False, True, False] , 'b':[True]*4 } )

    print(d)
           a     b
    0   True  True
    1  False  True
    2   True  True
    3  False  True

    print( s[0:2] )
    0    True
    1    True
    dtype: bool

    print( d.loc[ d['a'] , 'b' ] )
    0    True
    2    True
    dtype: bool

    print( s[0:2] & d.loc[ d['a'] , 'b' ] )
    0     True
    1    False
    2    False
This last statement's value catches me entirely by surprise in yielding a result of 3 elements. Realizing the influence of the indices here, I manually reset the index to yield the result I expected.
    s[0:2].reset_index(drop=True) & d.loc[ d['a'] , 'b' ].reset_index( drop=True )
    0    True
    1    True
Needless to say, I'll need to revisit the documentation and get a grip on how the indexing rules apply here. Can anyone explain step by step how this operator behaves with mixed indexes?
=============================================
Just to add a comparison for those coming from a similar R background: the equivalent operation on R's data.frame yields what I'd expect...
    > a = c(TRUE,FALSE,TRUE,FALSE)
    > b = c(TRUE,TRUE,TRUE,TRUE)
    >
    > d = data.frame( a, b )
    > d
          a    b
    1  TRUE TRUE
    2 FALSE TRUE
    3  TRUE TRUE
    4 FALSE TRUE
    > s = c( TRUE,TRUE,TRUE,TRUE)
    > s
    [1] TRUE TRUE TRUE TRUE
    >
    > d[ d$a , 'b']
    [1] TRUE TRUE
    >
    > s[0:2]
    [1] TRUE TRUE
    > s[0:2] & d[ d$a , 'b']
    [1] TRUE TRUE
You are comparing two Series with different indices:
    s[0:2]
    0    True
    1    True
    dtype: bool
and
    d.loc[ d['a'] , 'b']
    0    True
    2    True
    dtype: bool
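pandas needs to align the indices before it compares the values. The result is labelled by the union of both indices (0, 1, 2), which is why it has 3 elements.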
    s[0:2] & d.loc[ d['a'] , 'b']
    0     True   # True in both indices, therefore True
    1    False   # True in s[0:2], missing in the other, therefore False
    2    False   # True in d, missing in the other, therefore False
    dtype: bool
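The alignment described above can be spelled out by hand. This is only a sketch of the idea, not pandas' internal code: the names `left`, `right`, `labels`, `aligned_result` and `positional_result` are made up for illustration, and filling missing labels with False is an assumption chosen to reproduce the output shown above.

```python
import pandas as pd

s = pd.Series([True] * 4)
d = pd.DataFrame({"a": [True, False, True, False], "b": [True] * 4})

left = s[0:2]               # index labels 0, 1
right = d.loc[d["a"], "b"]  # index labels 0, 2

# Step 1: take the union of both indexes (labels 0, 1, 2).
labels = left.index.union(right.index)

# Step 2: reindex both operands onto that union. A label missing from one
# operand is filled with False (assumption: this mirrors the result above).
left_aligned = left.reindex(labels, fill_value=False)
right_aligned = right.reindex(labels, fill_value=False)

# Step 3: the indexes are now identical, so & is a plain elementwise AND.
aligned_result = left_aligned & right_aligned
print(aligned_result)

# To pair values by position instead of by label, drop the labels first.
positional_result = left.to_numpy() & right.to_numpy()
print(positional_result)
```

Working on the underlying NumPy arrays with `to_numpy()` is an alternative to the `reset_index(drop=True)` approach from the question. Both ignore the labels, so they only make sense when the two operands are known to have the same length and order.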