python - String values not converting to numeric values with replace() method
I am using .replace() to replace string values with numeric values for analysis. I am getting no errors, but when I inspect the DataFrame afterwards the values remain unchanged. I have tried using regex=True, but have had the same problem. Any help is appreciated. A print screen of the notebook is attached below, and the raw code follows.
df['international plan'].replace(['no', 'yes'], [0, 1], inplace=True)
df['voice mail plan'].replace(['yes', 'no'], [1, 0], inplace=True)
df['churn'].replace(['false', 'true'], [0, 1], inplace=True)
micah
The problem is whitespace in the values:
import numpy as np
import pandas as pd

# Sample data whose string values carry a leading space
np.random.seed(789)
df = pd.DataFrame({'international plan': np.random.choice([' yes', ' no'], size=5),
                   'voice mail plan': np.random.choice([' yes', ' no'], size=5),
                   'churn': np.random.choice([' false.', ' true.'], size=5),
                   'area code': np.random.choice([415, 408], size=5)})
print(df)

   area code    churn international plan voice mail plan
0        408    true.                 no             yes
1        415   false.                yes             yes
2        408    true.                yes              no
3        408   false.                yes             yes
4        408   false.                 no             yes
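One way to confirm that whitespace is the culprit is to look at the raw values with repr(), which keeps the quotes. This is only a minimal sketch, assuming a small made-up frame with the same column names (the sample rows are hypothetical, not the asker's data):

```python
import pandas as pd

# Hypothetical sample mirroring the columns above; the leading spaces are deliberate.
df = pd.DataFrame({
    'international plan': [' yes', ' no', ' yes'],
    'churn': [' false.', ' true.', ' false.'],
})

# repr() keeps the quotes, so stray whitespace becomes visible.
raw_values = [repr(v) for v in df['international plan'].unique()]
print(raw_values)

# True for every value that changes when stripped, i.e. has surrounding whitespace.
has_padding = df['international plan'] != df['international plan'].str.strip()
print(has_padding.any())
```

If has_padding.any() is True, an exact match on 'yes' or 'no' cannot succeed, which would explain why replace() leaves the column untouched without raising an error.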
The solution is to use apply to loop over the columns in cols, then use str.strip and Series.replace with a dict:
cols = ['international plan', 'voice mail plan', 'churn']
d = {'no': 0, 'yes': 1, 'true.': 1, 'false.': 0}
# Strip whitespace in each column, then map the cleaned strings to numbers
df[cols] = df[cols].apply(lambda x: x.str.strip().replace(d))
print(df)

   area code  churn  international plan  voice mail plan
0        408      1                   0                1
1        415      0                   1                1
2        408      1                   1                0
3        408      0                   1                1
4        408      0                   0                1
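Because replace() silently leaves unmatched values as they are, it can help to use Series.map while debugging, since map returns NaN for anything missing from the dict. The following is a rough sketch of that idea rather than part of the answer above; the sample rows are made up and the names mapped, unmatched and raw_unmatched are only illustrative:

```python
import pandas as pd

# Hypothetical sample with the same leading-space problem.
df = pd.DataFrame({
    'international plan': [' yes', ' no', ' yes'],
    'voice mail plan': [' no', ' no', ' yes'],
    'churn': [' false.', ' true.', ' false.'],
})

cols = ['international plan', 'voice mail plan', 'churn']
d = {'no': 0, 'yes': 1, 'true.': 1, 'false.': 0}

# Without stripping, no key matches, so every value maps to NaN.
raw_unmatched = int(df['churn'].map(d).isna().sum())

# Series.map returns NaN for any value missing from the dict,
# so a failed match is visible instead of being left unchanged.
mapped = df[cols].apply(lambda s: s.str.strip().map(d))
unmatched = int(mapped.isna().sum().sum())

print(raw_unmatched)
print(mapped)
print(unmatched)
```

A non-zero unmatched count would point to values that the dict does not cover, for example a different spelling or trailing punctuation.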
Or add the whitespace to the keys in the dict and use DataFrame.replace:
cols = ['international plan', 'voice mail plan', 'churn']
d = {' no': 0, ' yes': 1, ' true.': 1, ' false.': 0}
df[cols] = df[cols].replace(d)
And if you want to replace each column separately:
df['international plan'] = df['international plan'].str.strip().replace(['no', 'yes'], [0, 1])
df['voice mail plan'] = df['voice mail plan'].str.strip().replace(['yes', 'no'], [1, 0])
df['churn'] = df['churn'].str.strip().replace(['false.', 'true.'], [0, 1])
print(df)

   area code  churn  international plan  voice mail plan
0        408      1                   0                1
1        415      0                   1                1
2        408      1                   1                0
3        408      0                   1                1
4        408      0                   0                1
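If the data comes from a CSV file with a space after each comma (an assumption, since the question does not show how df was loaded), the leading whitespace can also be dropped at read time with the skipinitialspace parameter of read_csv. A small sketch with made-up CSV text; the state column and its values are hypothetical:

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for the real file; note the space after each comma.
csv_text = "state,international plan,churn\nKS, no, false.\nOH, yes, true.\n"

# skipinitialspace=True tells read_csv to skip spaces that follow the delimiter.
clean = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

# With the leading spaces gone, plain keys match.
clean['international plan'] = clean['international plan'].map({'no': 0, 'yes': 1})
clean['churn'] = clean['churn'].map({'false.': 0, 'true.': 1})
print(clean)
```

This only deals with spaces after the delimiter, so str.strip is still the safer choice when values may also have trailing whitespace.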