Search code examples
pandasexcelcsvformattingxlsxwriter

Taking data from CSV to XLSX via Pandas and XLSXwriter, error displaying numbers as text


Trying to take data from a CSV file to XLSX file in order to make charts out of it. Data is formatted in the following setup in the CSV file with channel names in the first row, units in the second row (time,pressure,boolean,empty) and data in the third row through X rows:

channel_name_1,channel_name_2,channel_name_3

unit_1,unit_2,unit_3

1,4,0.03

1,7,0.03

1,2,0.02

....

# reading the data in CSV with pandas by:
csv = file.csv
data = pd.read_csv(csv, sep=';')

# creating a workbook / worksheet with:
excel_file_path = r"file.xlsx"
workbook = xlsxwriter.Workbook(excel_file_path, {"nan_inf_to_errors": True})
data_worksheet = workbook.add_worksheet()
Data_chart = workbook.add_worksheet("Chart1")

# adding data to the sheet by:
for column_num, column_name in enumerate(data):
    data_worksheet.write(0,column_num,column_name)
    data_worksheet.write_column(1, column_num, data[column_name])

The above code will get all of the data into excel although the problem is the data (which are numbers) give a green triangle in the upper left corner of each cell in the excel file with the message "the number in this cell is formatted as text". Which makes them not plot-able because excel thinks it is text instead of numbers.

Also the units row (second row) is sometimes a word (sec), sometimes a boolean, sometimes empty so I cannot just format the whole column of data in float/number.

Anyone else run into this problem?


Solution

  • Since you already using a XlsxWriter constructor parameter you could add the strings_to_numbers option. This will give you numeric values in Excel:

    workbook = xlsxwriter.Workbook(
        excel_file_path, {"nan_inf_to_errors": True, "strings_to_numbers": True}
    )
    

    Output:

    enter image description here