Search code examples
pythonexcelpandasimportline-breaks

Missing line breaks in cells after importing Excel spreadsheet into Pandas DataFrame


I'm using Python Pandas to import an Excel spreadsheet (xlsx) into a dataframe. The Excel spreadsheet contains text in some columns. An example cell with multiline text is given below:

document belongs to family 13: 
claims are important. 
description discloses a bycicle with 3 wheels.
assigne is well known Trudi Bikes Inc.

The cells contain line breaks. The example has 4 lines. The problem is that the line breaks get lost when importing the spread into the dataframe. All lines are merged into one line.

As far as I know Excel adds line breaks with a special character CHR(10) or the like.

My question therefore: is there a way of keeping this special characters when importing a spreadsheet (or the line break) into a pandas data frame? I wouldn't mind having something like escape sequence in the string of the dataframe cell.

Regards,

Vinz


Solution

  • In this way I keep the \n in the df:

     df = pd.read_excel(yourfile, 'Sheet1')
    

    enter image description here

    Output:

                     col1     col2      col3
    0  hello\ncome in\ngo    hello        hi
    1                 bye  bye bye  bye\nbye