Search code examples
mysqllinuxbashmysqlimport

Use mysqlimport with a subset of a file's columns


I have a file, say data.csv, which is coming into my server, and which I want to import to the data table in a MySQL database using mysqlimport.

My issue is, this file has a lot more data than I actually want to import. It is used by another server before me which needs all this data, but I only need a handful of columns from it.

Is there a way to specify what columns I want mysqlimport to use?

I thought it would look something like

mysqlimport --columns="field_1","field_2","field_42","field_31","field_16","field_4" db data.csv

but the table just contains the first 6 fields of the file instead. If needs be, I can rearrange the table so that the fields I want are in order (i.e., I'd be running --columns="field_1","field_2","field_4","field_16",...).

I'm vaguely aware that this could be done using sed, but I'm curious whether mysqlimport natively supports it?

EDIT: The accepted answer isn't exactly what I was hoping for, but I think the answer is, "no, it sadly doesn't support it". In any case, check out the accepted answer and comments for workarounds!


Solution

  • The --columns option doesn't name the columns of the input file. It names the columns of the destination table you are importing into. All the fields of the input file need to go somewhere. You need to have as many columns in --columns as the fields of the input file.

    But not all the fields of the input file need to go into columns of your table.

    Huh?

    Here's a trick (this works in LOAD DATA INFILE too): You can "send" the fields of the input to either a real column of your table, or to a session variable. This variable will just get overwritten by every row input from your file, but that doesn't matter. Think of it like a dummy variable in a code project.

    Example: I created a table with three columns.

    mysql> create table mytable (one text, two text, four text);
    

    I created an input text file with four fields.

    $ cat mytable.csv
    one,two,three,four
    

    Here I import the fields into their respective columns, skipping field three by putting it into a dummy variable.

    $ mysqlimport --local --columns one,two,@dummy,four --fields-terminated-by=, test mytable.csv 
    

    Success!

    mysql> select * from mytable;
    +------+------+------+
    | one  | two  | four |
    +------+------+------+
    | one  | two  | four |
    +------+------+------+