python, pandas, data-cleaning

Cleaning data more efficiently in Pandas


I have a Python script that pulls EPS information from streetinsider.com. Currently I'm cleaning the data with the inefficient method shown below. Can someone show how this can be done more efficiently?

The following example is heavily scaled down; the real table has many more columns and many more rows.

from pandas import DataFrame

eps_table = DataFrame({'% Beat': '+1,405%', '% Week': '+123%'}, index=[0])

things_to_remove = ['% Beat', '% Week']
for i in things_to_remove:
    # One pass per character: strip '%', then '+', then ','
    eps_table[i] = eps_table[i].replace("%", "", regex=True)
    eps_table[i] = eps_table[i].replace(r"\+", "", regex=True)
    eps_table[i] = eps_table[i].replace(",", "", regex=True)

Thanks.


Solution

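  • Do it all at once with a single character class. Note that replace returns a new DataFrame rather than modifying eps_table in place, so assign the result back:

    eps_table = eps_table.replace(r'[%+,]', '', regex=True)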
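If only some columns should be cleaned, the same one-pass replacement can be limited to those columns. The following is a minimal sketch using the sample row from the question. The names cols and cleaned are only illustrative, and the final conversion with pd.to_numeric assumes you want numbers rather than strings, which the question does not state.

```python
import pandas as pd

# Same one-row sample as in the question.
eps_table = pd.DataFrame({'% Beat': '+1,405%', '% Week': '+123%'}, index=[0])

# Columns to clean (taken from the question); any other columns stay untouched.
cols = ['% Beat', '% Week']

# A single regex pass removes '%', '+' and ',' from every listed column.
# Assumption: the stripped strings should then be converted to numbers.
cleaned = eps_table[cols].replace(r'[%+,]', '', regex=True).apply(pd.to_numeric)

# replace() and apply() return new objects, so write the result back.
eps_table[cols] = cleaned

print(eps_table)
```

Because the character class is applied to the whole selection in one call, there is no need for a Python-level loop over columns or for one replace call per character.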