Search code examples
mysqlnumber-formattingbulk-load

LOAD Data text fields vs numeric comma interactions


I'm using the following to load a csv file to a mysql database:

LOAD DATA LOCAL INFILE 'file_location.csv' 
INTO TABLE social_spend 
FIELDS TERMINATED BY ',' 
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS

I have a text field that is written as: 'D,i,Y', which, after being uploaded shows in my database as the same:'D,i,Y'

I have a numeric field that is written as: '1,234' that is automatically trunacted on upload to: '1'

Why are text field commas preserved, but numeric fields are not?


Solution

  • When MySQL encounters a text string in a numeric context, it coerces it to a number. Sloppily.

    So if you use the string 00134abc in a numeric context, you get the number 134. And, in your case it sees 1,234 and coerces it to 1.

    Yeah, we all know about this being a pain in the xxx neck.

    You may be able to do your load with something like this (NOT debugged!)

    LOAD DATA LOCAL INFILE 'file_location.csv' 
    INTO TABLE social_spend 
    FIELDS TERMINATED BY ',' 
    OPTIONALLY ENCLOSED BY '"'
    LINES TERMINATED BY '\r\n'
    IGNORE 1 ROWS
    (col1, col2, @yourNumber, col4)
    SET numberColumn = CAST(REPLACE(@yourNumber, ',', '') AS SIGNED INTEGER);
    

    Between the parentheses you can list columns of your table to correspond to columns in your CSV. If you put @something you can use it in an expression in a SET clause. The expression I showed strips out the commas from your numbers.