Search code examples
mysqlload-data-infile

MySql LOAD DATA INFILE failing with commas


Here's the statement:

LOAD DATA INFILE '/var/lib/mysql-files/test.csv'
INTO TABLE test 
FIELDS TERMINATED BY ',' ENCLOSED BY '"' ESCAPED BY '\\' 

Here's a line that imports fine:

1450327840667357185, 1, "This is my text"

If the text field has a comma though, it fails. For example, see this line:

1450327840667357185, 1, "This is my text, with a comma in it"

Why does it fail?

If I escape the comma (by putting a backslash in front of it) it works fine. This does not make sense though. I have stipulated that the fields may be enclosed by double quotes, so why doesn't it accept all the content therein as a content of the field?

If I have to reprocess all the text to quote comma's, it would be a big task.


Solution

  • Your CSV has spaces after the , separator.

    I don't have MySQL to play with, but I expect that means it's seeing those spaces as part of the values, causing all sorts of problems.

    This means that the CSV line is read as...

    • VAL = 1450327840667357185
    • SEP = ,
    • VAL = 1
    • SEP = ,
    • VAL = "This is my text
    • SEP = ,
    • VAL = with a comma in it"

    Note that the leading spaces are included in the values.

    This means that the third values does not start with " (it starts with a space), so the ENCLOSED BY '"' isn't doing anything. This means that the parser treats the string as unquoted, which in turn means that the subsequent , is seen as a separator.


    Try TERMINATED BY ', '...

    This will mean that the the parser now reads the line as...

    • VAL = 1450327840667357185
    • SEP = ,
    • VAL = 1
    • SEP = ,
    • VAL = "This is my text, with a comma in it"

    Note that the third value now does start with " (because the leading space is now part of the separator), and so is a quoted string and so the subsequent , is not treated as a separator.