Search code examples
mysqlload-data-infile

Is there a work-around that allows missing data to equal NULL for LOAD DATA INFILE in MySQL?


I have a lot of large csv files with NULL values stored as ,, (i.e., no entry). Using LOAD DATA INFILE makes these NULL values into zeros, even if I create the table with a string like var DOUBLE DEFAULT NULL. After a lot of searching I found that this is a known "bug", although it may be a feature for some users. Is there a way that I can fix this on the fly without pre-processing? These data are all numeric, so a zero value is very different from NULL.

Or if I have to do pre-processing, is there one that is most promising for dealing with tens of csv files of 100mb to 1gb? Thanks!


Solution

  • With minimal preprocessing with sed, you can have your data ready for import.

    for csvfile in *.csv
    do
        sed -i -e 's/^,/\\N,/' -e 's/,$/,\\N/' -e 's/,,/,\\N,/g' -e 's/,,/,\\N,/g' $csvfile
    done
    

    That should do an in-place edit of your CSV files and replace the blank values with \N. Update the glob, *.csv, to match your needs.

    The reason there are two identical regular expressions matching ,, is because I couldn't figure out another way to make it replace two consecutive blank values. E.g. ,,,.