Is there a way to convert values like '34%' directly to int or float when using read_csv()
in pandas? I want '34%' to be read directly as 0.34.
Passing a dtype to read_csv did not work:
read_csv(..., dtype={'col':np.float})
After loading the CSV as 'df', this also failed, with the error "invalid literal for float(): 34%":
df['col'] = df['col'].astype(float)
I ended up using this, which works but is long-winded:
df['col'] = df['col'].apply(lambda x: np.nan if x in ['-'] else x[:-1]).astype(float)/100
You can define a custom function that converts your percentages to floats at read_csv()
time:
import io
import pandas as pd

# dummy data
temp1 = """index col
113 34%
122 50%
123 32%
301 12%"""

# Custom function taken from https://stackoverflow.com/questions/12432663/what-is-a-clean-way-to-convert-a-string-percent-to-a-float
def p2f(x):
    return float(x.strip('%')) / 100

# Pass to `converters` param as a dict (raw string avoids an invalid-escape warning)
df = pd.read_csv(io.StringIO(temp1), sep=r'\s+', index_col=[0], converters={'col': p2f})
df
        col
index
113    0.34
122    0.50
123    0.32
301    0.12
# Check that dtypes really are floats
df.dtypes
col    float64
dtype: object
The percent-to-float code is courtesy of ashwini's answer to "What is a clean way to convert a string percent to a float?"
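
One thing to watch: the workaround in the question treats '-' as a missing value, and p2f as written would raise a ValueError on such an entry. Below is a minimal sketch of a converter that copes with this, assuming '-' (and empty strings) should become NaN. The name p2f_or_nan and the sample data are made up for illustration.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical sample data: '-' marks a missing value, as in the question.
raw = """index col
113 34%
122 -
123 32%
301 12%"""


def p2f_or_nan(x):
    """Convert a string like '34%' to 0.34; return NaN for '-' or empty input."""
    x = x.strip()
    if x in ("", "-"):
        return np.nan
    return float(x.rstrip("%")) / 100


# Converters receive each cell as a string, so the placeholder check runs first.
df = pd.read_csv(
    io.StringIO(raw),
    sep=r"\s+",
    index_col=0,
    converters={"col": p2f_or_nan},
)
print(df)
print(df.dtypes)
```

Because NaN is itself a float, the column should still come out as float64 rather than object.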