Search code examples
mysqlcsvfile-uploadnullload-data-infile

MySQL: Imported CSV file converts Nulls to 0


I have checked some of the older discussions regarding this mater and it does not seem to fix the issue for me. The only solution avaiable was here: MySQL load NULL values from CSV data but I have way more variables than suggested in this case.

I am using

mysql  Ver 8.0.32-0ubuntu0.22.04.2 for Linux on x86_64 ((Ubuntu))

I wanted to check with you any other way to upload my csv file and keep the NULL values as NULL and not converted automatically to 0.

TABLE creation:

CREATE TABLE CovidDeaths
(iso_code   varchar(100)    ,
continent   varchar(100)    ,
location    varchar(100)    ,
date    varchar(100)    ,
total_cases int ,
new_cases   int ,
new_cases_smoothed  int ,
total_deaths int,
new_deaths  int ,
new_deaths_smoothed int ,
total_cases_per_million FLOAT   ,
new_cases_per_million   FLOAT   ,
new_cases_smoothed_per_million  FLOAT   ,
total_deaths_per_million    FLOAT   ,
new_deaths_per_million  FLOAT   ,
new_deaths_smoothed_per_million FLOAT   ,
reproduction_rate   FLOAT   ,
icu_patients    int ,
icu_patients_per_million    FLOAT   ,
hosp_patients   int ,
hosp_patients_per_million   FLOAT   ,
weekly_icu_admissions   int ,
weekly_icu_admissions_per_million   FLOAT   ,
weekly_hosp_admissions  int ,
weekly_hosp_admissions_per_million  FLOAT   ,
new_tests   int ,
total_tests int ,
total_tests_per_thousand    FLOAT   ,
new_tests_per_thousand  FLOAT   ,
new_tests_smoothed  int ,
new_tests_smoothed_per_thousand FLOAT   ,
positive_rate   FLOAT   ,
tests_per_case  FLOAT   ,
tests_units int ,
total_vaccinations  int ,
people_vaccinated   int ,
people_fully_vaccinated int ,
new_vaccinations    int ,
new_vaccinations_smoothed   int ,
total_vaccinations_per_hundred  FLOAT   ,
people_vaccinated_per_hundred   FLOAT   ,
people_fully_vaccinated_per_hundred FLOAT   ,
new_vaccinations_smoothed_per_million   FLOAT   ,
stringency_index    int ,
population  int ,
population_density  FLOAT   ,
median_age  FLOAT   ,
aged_65_older   int ,
aged_70_older   int ,
gdp_per_capita  FLOAT   ,
extreme_poverty int ,
cardiovasc_death_rate   FLOAT   ,
diabetes_prevalence FLOAT   ,
female_smokers  FLOAT   ,
male_smokers    FLOAT   ,
handwashing_facilities  int ,
hospital_beds_per_thousand  FLOAT   ,
life_expectancy FLOAT   ,
human_development_index FLOAT);

File Upload:

LOAD DATA LOCAL INFILE '/home/gyahia/Desktop/BootCamp/CovidVaccinations.csv'
INTO TABLE CovidVaccinations
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
-- OPTIONALLY ENCLOSED BY ''
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

Result of the Select Statement

enter image description here

Thank you for your help.


Solution

  • You have two options:

    • Edit the CSV input file to change values that should be NULL to \N. That escape sequence is loaded as NULL by LOAD DATA INFILE.

    • Use the NULLIF() function to transform '' to NULL as described in the other answer you linked to. You have more variables than that answer showed, but you only have 59, right? It will take longer to type these, but it will work.


    If you don't like to type that much, you can generate the code from a list of column names.

    You can get the list in a query like this:

    mysql> select column_name 
     from information_schema.columns 
     where table_name ='CovidDeaths';
    

    You can combine them with GROUP_CONCAT() to produce the right lines of code for your LOAD DATA INFILE:

    select concat(
     'LOAD DATA LOCAL INFILE ''/home/gyahia/Desktop/BootCamp/CovidVaccinations.csv''
      INTO TABLE CovidVaccinations 
      FIELDS TERMINATED BY '','' 
      ENCLOSED BY ''"'' 
      LINES TERMINATED BY ''\\n'' (',
    group_concat(concat('@', column_name)),
    ') 
    SET ',
    group_concat(concat(column_name, '=NULLIF(@', column_name, ', '''')')),
    ' IGNORE 1 ROWS;') as _load_data_stmt
    from information_schema.columns where table_name='CovidDeaths'