I was reading an xlm file using pandas.read_html and it works almost perfectly. The problem is that the file uses commas as decimal separators instead of dots (the default in read_html).
I could easily replace the commas with dots in one file, but I have almost 200 files with that configuration.
With pandas.read_csv you can define the decimal separator, but for some reason pandas.read_html only lets you define the thousands separator.
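To show the read_csv behaviour I mean, here is a minimal sketch. The semicolon-separated sample text is made up for illustration; sep, decimal and thousands are the actual read_csv parameters.

```python
import io

import pandas as pd

# Made-up sample: semicolon-separated, decimal comma, dot as thousands separator.
csv_text = "item;price\na;1.234,56\nb;7,5\n"

# decimal and thousands tell the parser how the numbers are written.
df = pd.read_csv(io.StringIO(csv_text), sep=";", decimal=",", thousands=".")

# The price column is parsed as floats instead of being left as strings.
print(df.dtypes)
print(df)
```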
Any guidance on this? Is there another way to automate the comma/dot replacement before the file is opened by pandas? Thanks in advance!
Thanks @zhqiat. I think upgrading pandas to version 0.19 would solve the problem. Unfortunately, I couldn't find an easy way to do that: the only tutorial I found for upgrading pandas was for Ubuntu, and I'm on Windows XP.
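For anyone who can upgrade, a minimal sketch of what that would look like. As far as I can tell, from pandas 0.19 on read_html accepts a decimal argument alongside thousands. The function name, the folder argument and the *.html pattern are made up for illustration, and read_html still needs an HTML parser such as lxml installed.

```python
from pathlib import Path

import pandas as pd


def read_comma_decimal_tables(folder, pattern="*.html"):
    """Read every matching file in `folder` with read_html, treating
    ',' as the decimal separator and '.' as the thousands separator.

    Returns a dict mapping each file name to the list of DataFrames
    that read_html found in that file.
    """
    tables = {}
    for path in sorted(Path(folder).glob(pattern)):
        # decimal= is the argument assumed to be available from pandas 0.19.
        tables[path.name] = pd.read_html(str(path), decimal=",", thousands=".")
    return tables
```

Calling read_comma_decimal_tables("my_folder") (a hypothetical folder name) would then cover all 200 files in one go, with no manual search and replace.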
I finally went with the workaround, using the method posted here: basically converting the columns, one by one, to a numeric pandas.Series:
result[col] = result[col].apply(lambda x: x.str.replace(".","").str.replace(",","."))
I know this solution isn't the best, but it works. Thanks!
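For completeness, a slightly fuller sketch of that workaround. It assumes the columns hold strings such as 1.234,56, adds pd.to_numeric for the actual conversion, and uses regex=False, which needs a newer pandas than the one I am on. The helper name and the sample data are made up for illustration.

```python
import pandas as pd


def to_numeric_columns(df, cols):
    """Return a copy of df where the given columns, holding strings
    like '1.234,56', are converted to floats."""
    out = df.copy()
    for col in cols:
        cleaned = (
            out[col]
            .astype(str)
            .str.replace(".", "", regex=False)   # drop thousands separators
            .str.replace(",", ".", regex=False)  # decimal comma -> decimal point
        )
        # Values that still cannot be parsed become NaN instead of raising.
        out[col] = pd.to_numeric(cleaned, errors="coerce")
    return out


# Made-up example data standing in for one table returned by read_html.
result = pd.DataFrame({"name": ["a", "b"], "price": ["1.234,56", "7,5"]})
result = to_numeric_columns(result, ["price"])
print(result.dtypes)
```

Note that x.str inside apply only works when x is a whole column, so the one-liner above needs col to be a list of column names; looping over single columns as in this sketch avoids that.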