Search code examples
vertica

how to create table definition from csv file and also copy data at the same time


I want to load data from a csv file into Vertica. I don't want to create table and the copy data in two separate steps. Instead, I want to create the table, specify the csv file and then let vertica figure out column definitions (names, data type) itself and then load the data.

Something like create table titanic_train () as COPY FROM '/data/train.csv' PARSER fcsvparser() rejected data as table titanic_train_rejected abort on error no commit;

Is it possible?

I guess that if a table has 100s of columns then automating the create table, column definition and data copy would be much easier/faster than doing these steps separately


Solution

  • It's always several steps, no matter what.

    1. Use the built-in bits of Vertica:
    CREATE FLEX TABLE foo();
    COPY foo FROM '/data/mycsvs/foo.csv' PARSER fCsvParser();
    SELECT COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW('foo');
    -- THEN, either:
    SELECT * FROM foo_view;
    -- OR: create a ROS Table:
    CREATE TABLE foo_ros AS SELECT * FROM foo_view;
    
    1. Get a CSV-to-DDL parser from the net, like https://github.com/marco-the-sane/d2l, and install it then:
    $ d2l -coldelcomma -chardelquote -drp -copy /data/mycsvs/foo.csv | vsql
    

    So , in the second instance, it's one step, but it calls both d2l and vsql.