Excel cannot import a CSV file containing fields with multiple lines


Microsoft Excel will write a CSV file containing fields with multiple lines. The newlines inside those fields are 0A (LF, UNIX-style) rather than 0D 0A (CRLF).

However, it will not correctly read the .csv file it just wrote: the fields that contain 0A newlines become new rows. How can this be overcome?

The spreadsheet is saved both as t-xl.xlsx and as a CSV file, t-xl.csv.

PS H:\r> Format-Hex .\t-xl.csv

           Path: H:\r\t-xl.csv

           00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

00000000   66 31 2C 66 32 0D 0A 31 2C 66 6F 72 0D 0A 32 2C  f1,f2..1,for..2,
00000010   22 6E 6F 77 0A 69 73 0A 74 68 65 22 0D 0A 33 2C  "now.is.the"..3,
00000020   74 69 6D 65 0D 0A                                time..

When the t-xl.csv is loaded, Excel seems to remember and handle the newlines correctly (as it was in t-xl.xlsx).

However, when using Data > From Text, it will not handle the newlines correctly.

[Image: Failed .csv import]

At least one CSV reference (http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm) describes support for newlines embedded in fields. Is there any reason Microsoft Excel does not support this?
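
As a cross-check that the file itself is well-formed, here is a minimal sketch that feeds the exact bytes from the hex dump above to Python's standard csv module. The choice of Python and the helper name parse_csv_bytes are assumptions made for illustration; nothing here reflects how Excel parses the file internally.

```python
import csv
import io

# The 38 bytes shown in the Format-Hex dump of t-xl.csv:
# records end in CR LF, the quoted field contains bare LF.
RAW = b'f1,f2\r\n1,for\r\n2,"now\nis\nthe"\r\n3,time\r\n'


def parse_csv_bytes(data: bytes) -> list:
    """Parse CSV bytes into rows (hypothetical helper for this example)."""
    # newline="" stops StringIO from translating line endings, so the
    # csv module sees the original CR LF and LF characters unchanged.
    text = io.StringIO(data.decode("ascii"), newline="")
    return list(csv.reader(text))


rows = parse_csv_bytes(RAW)
for row in rows:
    print(row)
```

A parser that honors quoting returns four records here, and the second field of the third record is the single value "now", "is", "the" joined by line feeds. That suggests the problem lies in the import path, not in the file Excel wrote.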


Solution

  • The legacy Excel Text Import Wizard does not treat a quoted line break as part of the field; it starts a new row there.

    Opening the file directly with Excel, as you have seen, will respect the quoted line-breaks.

    If you have Excel 2010+, you can use Power Query to Get & Transform from Text/CSV. There is an option to respect quoted line breaks (I believe it is enabled by default).

    The only workaround for the legacy wizard of which I am aware is to pre-process the file, replacing each quoted line break with something else, and then to process the data again after import, replacing that "something else" with the line break.
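
The pre-processing step could be sketched roughly as follows. This is only one way to do it, assuming Python is available; the placeholder token <<LF>> and the function names mask_embedded_newlines and unmask are made up for this example, and the token must not already occur in the data.

```python
import csv
import io

PLACEHOLDER = "<<LF>>"  # hypothetical marker; pick one absent from your data


def mask_embedded_newlines(data: str, placeholder: str = PLACEHOLDER) -> str:
    """Return CSV text in which line breaks inside fields are replaced."""
    # newline="" keeps the original line endings visible to the csv module.
    reader = csv.reader(io.StringIO(data, newline=""))
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\r\n")
    for row in reader:
        writer.writerow(
            [
                field.replace("\r\n", placeholder)
                .replace("\n", placeholder)
                .replace("\r", placeholder)
                for field in row
            ]
        )
    return out.getvalue()


def unmask(value: str, placeholder: str = PLACEHOLDER) -> str:
    """Reverse the substitution for a single cell value."""
    return value.replace(placeholder, "\n")


# Same content as t-xl.csv in the question.
sample = 'f1,f2\r\n1,for\r\n2,"now\nis\nthe"\r\n3,time\r\n'
masked = mask_embedded_newlines(sample)
print(repr(masked))
```

For a real file, read it with open(path, newline="") and write the masked text back out the same way, so that no line endings are translated along the way. After the legacy wizard has imported the masked file, the reverse step can be done inside Excel, for instance with a formula such as =SUBSTITUTE(B3,"<<LF>>",CHAR(10)) and wrap text turned on for the cell.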