python - Filtering a pandas Series using a lambda expression that operates on individual elements


I'm trying to use pandas to chain map and filter operations. I've come across several options, partly outlined here: "Pandas How to filter a Series".

To summarize,

    import pandas as pd

    s = pd.Series(range(10))

    s.where(s > 4).dropna()
    s.where(lambda x: x > 4).dropna()

    s.loc[s > 4]
    s.loc[lambda x: x > 4]

    s.to_frame(name='x').query("x > 4")

This works fine for numerical comparisons and equality checks, but it doesn't work for predicates involving other operations. As a simple example, consider matching against the first character of a string.

    s = pd.Series(['aa', 'ab', 'ba'])
    s.loc[lambda x: x.startswith('a')]  # fails

This fails with a message like "'Series' object has no attribute 'startswith'", since the argument x passed to the lambda expression in the second line is the Series itself, rather than the individual elements it contains.
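To make the failure concrete, here is a minimal sketch (not part of the original question) that records what .loc hands to a callable. The helper name record_and_select_all and the received list are made up for illustration.

```python
import pandas as pd

s = pd.Series(['aa', 'ab', 'ba'])

received = []  # hypothetical log of the argument types .loc passes in

def record_and_select_all(x):
    # Note the type of the argument, then return an all-True mask
    # with the same index so that .loc keeps every row.
    received.append(type(x).__name__)
    return pd.Series(True, index=x.index)

result = s.loc[record_and_select_all]

print(received[0])      # Series
print(result.tolist())  # ['aa', 'ab', 'ba']
```

The callable sees the whole Series, so any element-level method such as startswith has to be applied through something that iterates over the values.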

Interestingly, map does allow element-wise access:

    pd.Series(list('abcd')).map(lambda x: x.upper())  # results in ['A', 'B', 'C', 'D'], though Series has no upper method

While there are clever ways to handle the startswith example, I'm hoping to find a more general solution where a Series can be filtered using a function that accepts the individual values of the collection. And, ideally, one that allows chaining operations, as in,

    s = (pd.Series(...)
            .map(...)
            .where(...)
            .map(...))

Is this supported in pandas?

Update: Scott provided an answer for cases where the value is a string, which can be handled with Series.str as described in the answer.

But what about cases where the Series contains objects? Is there a way to access their attributes or apply functions to them?

I guess the standard way of managing this case is to de-structure the relevant fields of the object into a DataFrame, with each attribute as a column. Though there might be cases where you want to transform a collection of objects with map and filter (loc/where), without having to disassemble the complex type into a DataFrame and then convert it back.
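For what it's worth, one possible approach to the object case, sketched under assumptions: use map to turn an element-wise predicate into a boolean mask, then index with that mask. The Point class and its x and y attributes are hypothetical stand-ins for "a complex type"; nothing here comes from the original answer.

```python
import pandas as pd

class Point:
    """Hypothetical object type stored in the Series."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

points = pd.Series([Point(1, 2), Point(5, 1), Point(7, 9)])

# map calls the predicate on each object and yields a boolean Series.
mask = points.map(lambda p: p.x > 4)

# Boolean indexing keeps only the objects whose mask value is True.
kept = points[mask]

print(mask.tolist())                      # [False, True, True]
print([(p.x, p.y) for p in kept])         # [(5, 1), (7, 9)]
```

This keeps the objects intact, at the cost of a Python-level loop inside map rather than a vectorized operation.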

I'm partly trying to find an alternative to the standard map()/filter() functions in Python, where the operations have to be nested in reverse.

i.e.,

map(function3, filter(function2, map(function1, collection))) 
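As an illustration of the nesting issue, the sketch below runs the built-in version next to a chained pandas version that reads top to bottom. The bodies of function1, function2 and function3 are invented for the example, and the chained form (a callable passed to .loc that builds its mask with map) is one possible arrangement, not something the original post confirms.

```python
import pandas as pd

def function1(v):   # hypothetical transform: square the value
    return v * v

def function2(v):   # hypothetical predicate: keep even values
    return v % 2 == 0

def function3(v):   # hypothetical transform: add one
    return v + 1

collection = [1, 2, 3, 4]

# Built-ins: written outermost-first, so it reads in reverse order.
nested = list(map(function3, filter(function2, map(function1, collection))))

# pandas: each step is chained in the order it is applied.
chained = (pd.Series(collection)
           .map(function1)
           .loc[lambda s: s.map(function2)]  # element-wise predicate -> mask
           .map(function3)
           .tolist())

print(nested)   # [5, 17]
print(chained)  # [5, 17]
```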

Use the .str string accessor, which is needed for string operations on a pandas Series.

    s = pd.Series(['aa', 'ab', 'ba'])
    s.loc[lambda x: x.str.startswith('a')]

When using map, you apply the string function to each element, and therefore don't need the string accessor.
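A small sketch comparing the two routes on the same data; the variable names via_map and via_str are only for illustration.

```python
import pandas as pd

s = pd.Series(['aa', 'ab', 'ba'])

# map hands each string to the lambda, so plain str methods work.
via_map = s.map(lambda v: v.startswith('a'))

# The .str accessor applies the same method across the whole Series.
via_str = s.str.startswith('a')

print(via_map.tolist())  # [True, True, False]
print(via_str.tolist())  # [True, True, False]
```

One difference to keep in mind: with missing values in the Series, the lambda would be called on a non-string and raise, whereas the .str accessor propagates the missing value.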

And to @pirsquared's point in the comments, you don't need the lambda at all; you can use boolean indexing.

    s = pd.Series(['aa', 'ab', 'ba'])
    s.loc[s.str.startswith('a')]

s.str.startswith returns a True/False boolean Series which, when placed in brackets on the Series, returns the values that align with True.
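To see the two steps separately, this sketch builds the mask first and then indexes with it. The labels r0, r1 and r2 are an assumption added here so the alignment between mask and data is visible.

```python
import pandas as pd

# Hypothetical index labels, added only to show how rows line up.
s = pd.Series(['aa', 'ab', 'ba'], index=['r0', 'r1', 'r2'])

# Step 1: a boolean Series sharing the index of s.
mask = s.str.startswith('a')

# Step 2: only the labels where the mask is True are returned.
selected = s.loc[mask]

print(mask.to_dict())      # {'r0': True, 'r1': True, 'r2': False}
print(selected.to_dict())  # {'r0': 'aa', 'r1': 'ab'}
```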

