Search code examples
sqlarraysbashsql-insertquoting

How to auto-quote "string values" of a SQL INSERT STATEMENT in Bash script


I have written a simple Bash script that generates a SQL INSERT STATEMENT for multiple rows from CSV data entirely comprised of integers and saves it to a text file (so that I can then copy and paste it wherever I want).

Now I want to know how to auto-quote string values; i.e. automatically wrap those columns of the CSV whose data type is string and would need "" in order to be inserted into a SQL table. As you can see below, the script reads the entire CSV row and wouldn't differentiate between columns. Would I have to assign each column to a variable and then quote the string ones separately? Or is there a more efficient way? Either way, I'd be grateful to see your fixes for the above problem!

#!/bin/bash

echo Path to to-be-imported CSV:
read csv_file
echo Table name to import into:
read table

echo "INSERT INTO $table VALUES" > SQL_INSERT_$table.txt
while read line
do
 echo "($line),"
done < <(tail -n +2 $csv_file) >> SQL_INSERT_$table.txt && sed -i '' '$ s/.$/;/' SQL_INSERT_$table.txt

Solution

  • This is the modified version of the script above that successfully auto-quotes string columns:

    #!/bin/bash
    
    # Prompt for input and: 1. enter CSV path to be imported into DB; 2. its equivalent table name in the DB 
    echo Path to CSV:
    read CSV_file
    echo DB table name to import into:
    read DB_table
    
    # Create .txt file that will contain SQL INSERT STATEMENT and enter DB table name
    echo "INSERT INTO $DB_table VALUES" > SQL_INSERT_$DB_table.txt
    
    # Auto-quote string columns in the .txt file, leaving other columns intact
    awk -F, 'OFS=FS {for (i=1;i<=NF;i++) {if (match($i, /^[0-9.-]+$/)==0) {printf "\"" $i "\""} else {printf $i}; if (i<NF) printf OFS}; printf "\n"}' $CSV_file > temp.txt
    
    # read-while loop to populate INSERT STATEMENT row values from CSV (2nd row to the end) and replace final comma with semicolon for those RDBMS's that require a concluding semicolon at the end of SQL STATEMENT
    while read line
    do
     echo "($line),"
    done < <(tail -n +2 temp.txt) >> SQL_INSERT_$DB_table.txt && sed -i '' '$ s/.$/;/' SQL_INSERT_$DB_table.txt
    
    # Delete temporary .txt file that contained auto-quoted string values
    rm temp.txt
    

    This is a more complete version of the script that not only auto-quotes string columns but also lists out column names into the SQL INSERT STATEMENT:

    #!/bin/bash
    
    # Prompt for input and: 1. enter CSV path to be imported into DB; 2. its equivalent table name in the DB 
    echo Path to CSV:
    read CSV_file
    echo DB table name to import into:
    read DB_table
    
    # Create .txt file that will contain SQL INSERT STATEMENT and enter DB table name
    echo "INSERT INTO $DB_table (" > SQL_INSERT_$DB_table.txt
    
    # List out CSV header as INSERT STATEMENT column names and append to .txt file
    echo "`head -n 1 $CSV_file`)" >> SQL_INSERT_$DB_table.txt
    
    # Auto-quote string columns in the .txt file, leaving other columns intact
    awk -F, 'OFS=FS {for (i=1;i<=NF;i++) {if (match($i, /^[0-9.-]+$/)==0) {printf "\"" $i "\""} else {printf $i}; if (i<NF) printf OFS}; printf "\n"}' $CSV_file > temp.txt
    echo "VALUES" >> SQL_INSERT_$DB_table.txt
    
    # read-while loop to populate INSERT STATEMENT row values from CSV (2nd row to the end) and replace final comma with semicolon for those RDBMS's that require a concluding semicolon at the end of SQL STATEMENT
    while read line
    do
     echo "($line),"
    done < <(tail -n +2 temp.txt) >> SQL_INSERT_$DB_table.txt && sed -i '' '$ s/.$/;/' SQL_INSERT_$DB_table.txt
    
    # Delete temporary .txt file that contained auto-quoted string values
    rm temp.txt