Search code examples
unixsedsql-loader

How do I load a column using sqlloader if the CSV column is empty except for the header?


I would like to load a CSV file using sqlloader without having to edit the CSV file each time. When the file is sent to me, the data in the first column is only included in a header above the rest of that data. First code snippet is an example of what I'm receiving with fake data, second code snippet is what I would like to change it to.

Each row in the SQL table needs to include the number from row one.

1,Intern,
,1/8/2023,Bob
,5/3/2022,Alice
,7/25/2022,Charles
2,Assistant,
,1/8/2023,Heather
,5/3/2022,Harold
,7/25/2022,Dave
3,Manager,
,1/1/2023,Tim
,1/8/2023,Lyon
,5/3/2022,Greg
,7/25/2022,Tyler
5,Head Manager,
,1/8/2023,Charles
,5/3/2022,Zack

How I need it to look:

1,Intern,
1,1/8/2023,Bob
1,5/3/2022,Alice
1,7/25/2022,Charles
2,Assistant,
2,1/8/2023,Heather
2,5/3/2022,Harold
2,7/25/2022,Dave
3,Manager,
3,1/1/2023,Tim
3,1/8/2023,Lyon
3,5/3/2022,Greg
3,7/25/2022,Tyler
5,Head Manager,
5,1/8/2023,Charles
5,5/3/2022,Zack

I'm thinking there may be some way to edit them in the shell script that calls the sqlldr command, such as sed or awk.


Solution

  • With an input file (input.csv) as shown in the question, the command

    awk -F, -v 'OFS=,' '{ if($1!="") x=$1; else $1=x } 1' input.csv
    

    prints

    1,Intern,
    1,1/8/2023,Bob
    1,5/3/2022,Alice
    1,7/25/2022,Charles
    2,Assistant,
    2,1/8/2023,Heather
    2,5/3/2022,Harold
    2,7/25/2022,Dave
    3,Manager,
    3,1/1/2023,Tim
    3,1/8/2023,Lyon
    3,5/3/2022,Greg
    3,7/25/2022,Tyler
    5,Head Manager,
    5,1/8/2023,Charles
    5,5/3/2022,Zack
    

    Explanation:

    • -F, set input field separator to ,
    • -v 'OFS=,' set output field separator ,
    • if($1!="") x=$1 save non-empty value from column 1
    • else $1=x replace empty value with saved value
    • 1 always-true condition with default action print (shortcut for {print})