python - String values not converting to numeric values with replace() method


I am using .replace() to replace string values with numeric values for analysis. I am getting no errors, but when I inspect the DataFrame afterwards the values remain unchanged. I have tried using regex=True and have had the same problem. Any help would be appreciated. A screenshot of my notebook is attached below, and the raw code follows.

    df['international plan'].replace(['no', 'yes'], [0, 1], inplace=True)
    df['voice mail plan'].replace(['yes', 'no'], [1, 0], inplace=True)
    df['churn'].replace(['false', 'true'], [0, 1], inplace=True)

(Screenshot of the Jupyter notebook)

micah

The problem is whitespace in the values:

    import numpy as np
    import pandas as pd

    np.random.seed(789)
    # sample data in which every string value carries a leading space
    df = pd.DataFrame({'international plan': np.random.choice([' yes', ' no'], size=5),
                       'voice mail plan': np.random.choice([' yes', ' no'], size=5),
                       'churn': np.random.choice([' false.', ' true.'], size=5),
                       'area code': np.random.choice([415, 408], size=5)})
    print(df)

       area code    churn international plan voice mail plan
    0        408    true.                 no             yes
    1        415   false.                yes             yes
    2        408    true.                yes              no
    3        408   false.                yes             yes
    4        408   false.                 no             yes
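To check whether your own data has this problem, one option is to look at the raw unique values rather than the printed DataFrame, since the printed table hides leading spaces. The snippet below is a minimal sketch on hypothetical sample values (the three-row frame is made up for illustration, not taken from the question's data):

```python
import pandas as pd

# Hypothetical sample data: every value has a leading space.
df = pd.DataFrame({'international plan': [' yes', ' no', ' yes']})

col = df['international plan']

# The list repr shows the quotes, so the leading space becomes visible.
values = list(col.unique())
print(values)  # [' yes', ' no']

# Count the values that differ from their stripped form.
n_padded = int((col != col.str.strip()).sum())
print(n_padded)  # 3
```

If the count is greater than zero, an exact-match replace on 'yes' or 'no' will not touch those values.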

The solution is to apply a function to the columns listed in cols, using str.strip and then Series.replace with a dict:

    cols = ['international plan', 'voice mail plan', 'churn']
    d = {'no': 0, 'yes': 1, 'true.': 1, 'false.': 0}
    # strip surrounding whitespace in each column, then replace the cleaned strings
    df[cols] = df[cols].apply(lambda x: x.str.strip().replace(d))
    print(df)

       area code  churn  international plan  voice mail plan
    0        408      1                   0                1
    1        415      0                   1                1
    2        408      1                   1                0
    3        408      0                   1                1
    4        408      0                   0                1
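As a small illustration of why the strip step matters: without regex=True, replace matches whole values exactly, so a key 'yes' never matches the value ' yes'. The sketch below uses a hypothetical three-element Series (explicitly object dtype, an assumption made so it behaves the same across pandas versions):

```python
import pandas as pd

# Hypothetical values with a leading space, stored as plain Python strings.
s = pd.Series([' yes', ' no', ' yes'], dtype=object)
d = {'no': 0, 'yes': 1}

# No key equals ' yes' or ' no', so nothing is replaced and no error is raised.
unchanged = s.replace(d)
print(unchanged.tolist())  # [' yes', ' no', ' yes']

# After stripping, the keys match and the values are converted.
converted = s.str.strip().replace(d)
print(converted.tolist())  # [1, 0, 1]
```

This is the same silent "no match, no change" behavior described in the question.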

Or add the leading whitespace to the keys in the dict and use DataFrame.replace:

    cols = ['international plan', 'voice mail plan', 'churn']
    d = {' no': 0, ' yes': 1, ' true.': 1, ' false.': 0}
    df[cols] = df[cols].replace(d)

And if you want to replace each column separately:

    df['international plan'] = df['international plan'].str.strip().replace(['no', 'yes'], [0, 1])
    df['voice mail plan'] = df['voice mail plan'].str.strip().replace(['yes', 'no'], [1, 0])
    df['churn'] = df['churn'].str.strip().replace(['false.', 'true.'], [0, 1])
    print(df)

       area code  churn  international plan  voice mail plan
    0        408      1                   0                1
    1        415      0                   1                1
    2        408      1                   1                0
    3        408      0                   1                1
    4        408      0                   0                1
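Because replace leaves unmatched values as they are without raising, it can be useful to verify the conversion. One possible check, which is not part of the answer above but a swapped-in alternative, is Series.map: values with no key in the dict become NaN, so leftovers are easy to find. A minimal sketch on hypothetical values (the 'maybe' entry is invented to show an unmapped case):

```python
import pandas as pd

# Hypothetical values; 'maybe' has no entry in the mapping dict.
s = pd.Series([' yes', ' no', 'maybe'], dtype=object)
d = {'no': 0, 'yes': 1}

# map() returns NaN for any stripped value that is not a key in d.
mapped = s.str.strip().map(d)

# Select the original values that could not be converted.
unmapped = s[mapped.isna()].tolist()
print(unmapped)  # ['maybe']
```

An empty list here would mean every value was converted.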
