python pandas export-to-csv

Pandas.to_csv Causing Floats to Appear


I have created a program that takes in Excel sheets, modifies them with Pandas, and saves them to a CSV.

When I inspect the output CSV, the columns appear as ints; however, when I upload the file to Oracle, the columns become floats.

Some columns in the input Excel sheets contain whole numbers such as "3", "4", "1", but other columns contain actual floats for monetary values. For this reason I can't apply astype(int) to the entire DataFrame.

The Excel sheets number in the hundreds and have different column names each month, so a column-wise application of astype(int) will not work.

Code sample:

import pandas as pd

# save_location and filepath are defined elsewhere in the program
output_location = save_location + '.csv'

df_manipulation = pd.read_excel(filepath, index_col=None)
df_manipulation.to_csv(output_location, index=False)

Any ideas on how to keep the columns in question as ints?


Solution

  • The documentation for read_excel(), specifically the entry for the convert_float parameter (deprecated in pandas 1.3 and removed in 2.0), notes that Excel stores all numbers as floats internally.

    If you want to convert a column from float to int manually, you can do the following (note that this raises an error if the column contains missing values):

    df_manipulation['column_name'] = df_manipulation['column_name'].astype('int')
    

    Alternatively, you can load the entire sheet with a single dtype, such as strings:

    df_manipulation = pd.read_excel(filepath, index_col=None, dtype = str)
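Since the question says the column names change every month, the per-column astype call above would have to be repeated by hand. One possible workaround, sketched here rather than taken from the original answer, is to scan every float column and convert only those whose non-missing values are all whole numbers. The helper name downcast_whole_floats and the sample data are hypothetical, and the sketch assumes pandas' nullable "Int64" dtype is acceptable for the downstream upload.

```python
import pandas as pd


def downcast_whole_floats(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy in which float columns holding only whole numbers become Int64.

    Hypothetical helper for illustration; not part of pandas.
    """
    out = df.copy()
    for col in out.select_dtypes(include="float64").columns:
        values = out[col].dropna()
        # Convert only when every non-missing value is a whole number,
        # so monetary columns such as 19.99 are left untouched.
        if not values.empty and (values % 1 == 0).all():
            # Nullable Int64 keeps missing cells as <NA> instead of failing.
            out[col] = out[col].astype("Int64")
    return out


# Made-up sample standing in for one sheet loaded with pd.read_excel
sample = pd.DataFrame({
    "quantity": [3.0, 4.0, 1.0],
    "price": [19.99, 5.5, 100.0],
    "rating": [2.0, float("nan"), 5.0],
})

cleaned = downcast_whole_floats(sample)
csv_text = cleaned.to_csv(index=False)
```

With this sample, quantity and rating should end up as Int64 and price should stay float64, so the CSV carries "3" rather than "3.0" for the whole-number columns. A column of monetary values that all happen to be whole numbers in a given month would also be converted, so this heuristic may need an exclusion list in practice.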