
Printing command by reading CSV file with AWK into a bash array


I have the following CSV file (mycsv.csv):

love,hate,--command yes this is bad --other-command
fox,duck,--command yes this is bad --other-command
turtle,rabbit

The third column holds extra options, so the command should end up as --command "command_argument" --other-command (here the argument is "yes this is bad").

Then I create three arrays, one per column:

my_csv="/home/sometimesihatebash/mycsv.csv"
first_column=( $(awk -F "\"*,\"*" '{print $1}' $my_csv) )
second_column=( $(awk -F "\"*,\"*" '{print $2}' $my_csv) )
IFS=$'\n' command=( $(awk -F "\"*,\"*" '{print $3}' $my_csv) )

Then I loop over the arrays and pass each row's columns to a command:

for i in "${!first_column[@]}"; do
    echo "${first_column[i]}"
    echo "${second_column[i]}"
    echo "${command[i]}"
    
    execute_program_name --first "${first_column[i]}" --second "${second_column[i]}" "${command[i]}"

done

But then, for the third row (which has an empty third column), I get:

error: 1 unrecognized argument:
'

And for the first two rows, I get:

no such option: --command yes this is bad --other-command

When echoing each array element, I don't see any problems. Pasting the exact output of echo also runs the program normally.

Running the for loop with just the first_column and second_column arrays works fine.

This probably has something to do with special characters in the last array, but I'm lost.

Any ideas?


Solution

  • Assumptions:

    • the input string --command yes this is bad --other-command must reach execute_program_name as three separate arguments: --command 'yes this is bad' --other-command
    • the 3rd column consists of long-form options (i.e., options always start with --)
    • if the 3rd column is populated, its first string is always a long-form option (i.e., it has the format --option)
    • the contents of mycsv.csv only need to be parsed for this one for loop, so a single array (args[]) is enough (see the 2nd half of the answer for an approach that parses once and reuses the results)
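
To make the first assumption concrete, here is a minimal sketch of the difference between passing the 3rd column as one quoted string and passing it as array elements. count_args is a hypothetical helper invented for this illustration; it only reports how many arguments it received.

```bash
#!/bin/bash
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { echo "$#"; }

c3='--command yes this is bad --other-command'

# Quoted string: the whole value arrives as ONE argument, so a program would
# look for an option literally named "--command yes this is bad --other-command".
as_string=$(count_args "${c3}")

# Array: each element arrives as its own argument.
args=( --command 'yes this is bad' --other-command )
as_array=$(count_args "${args[@]}")

# An empty quoted string is still passed as one (empty) argument,
# while an empty array expands to no arguments at all.
empty=''
none=()
as_empty_string=$(count_args "${empty}")
as_empty_array=$(count_args "${none[@]}")

echo "string=${as_string} array=${as_array} empty_string=${as_empty_string} empty_array=${as_empty_array}"
```

This prints string=1 array=3 empty_string=1 empty_array=0, which is why the approach in this answer builds an array rather than a single string.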

    We'll make some changes to OP's inputs to demonstrate handling options with and without arguments:

    $ cat mycsv.csv
    love,hate,--command yes this is bad --other-command
    fox,duck,--command yes this is bad --other-command single_string
    turtle,rabbit 
    sweet,sour,--new-command --newer-command
    

    A simple script to list input parameters:

    $ cat execute_program_name
    #!/bin/bash
    
    echo "### $0 $@"
    cnt=0
    
    for arg in "$@"
    do
        ((cnt++))
        echo "arg #${cnt} : ${arg}"
    done
    

    One bash approach:

    $ cat parse_n_run
    #!/bin/bash
    
    while IFS=, read -r c1 c2 c3
    do
        printf "\n########### %s : %s : %s\n\n" "${c1}" "${c2}" "${c3}"
    
        args=( "--first" "${c1}" "--second" "${c2}" )       # store first 2 sets of options in array
    
        while read -r option string
        do
            [[ -z "${option}" ]] && continue                # skip first line (blank)
            args+=( "${option}" )                           # add "--option" to array
            [[ -n "${string}" ]] && args+=( "${string}" )   # if there's an argument then add to the array
    
        done <<< "${c3//--/$'\n'--}"                        # split c3 into lines by placing a "\n" before each "--" (a literal "--" is portable; "&" as "matched text" needs bash 5.2+)
    
        typeset -p args                                     # display contents of array
        echo ""
        execute_program_name "${args[@]}"
    done < mycsv.csv
    

    Taking for a test drive:

    $ parse_n_run
    
    ########### love : hate : --command yes this is bad --other-command
    
    declare -a args=([0]="--first" [1]="love" [2]="--second" [3]="hate" [4]="--command" [5]="yes this is bad" [6]="--other-command")
    
    ### ./execute_program_name --first love --second hate --command yes this is bad --other-command
    arg #1 : --first
    arg #2 : love
    arg #3 : --second
    arg #4 : hate
    arg #5 : --command
    arg #6 : yes this is bad
    arg #7 : --other-command
    
    ########### fox : duck : --command yes this is bad --other-command single_string
    
    declare -a args=([0]="--first" [1]="fox" [2]="--second" [3]="duck" [4]="--command" [5]="yes this is bad" [6]="--other-command" [7]="single_string")
    
    ### ./execute_program_name --first fox --second duck --command yes this is bad --other-command single_string
    arg #1 : --first
    arg #2 : fox
    arg #3 : --second
    arg #4 : duck
    arg #5 : --command
    arg #6 : yes this is bad
    arg #7 : --other-command
    arg #8 : single_string
    
    ########### turtle : rabbit  :
    
    declare -a args=([0]="--first" [1]="turtle" [2]="--second" [3]="rabbit ")
    
    ### ./execute_program_name --first turtle --second rabbit
    arg #1 : --first
    arg #2 : turtle
    arg #3 : --second
    arg #4 : rabbit
    
    ########### sweet : sour : --new-command --newer-command
    
    declare -a args=([0]="--first" [1]="sweet" [2]="--second" [3]="sour" [4]="--new-command" [5]="--newer-command")
    
    ### ./execute_program_name --first sweet --second sour --new-command --newer-command
    arg #1 : --first
    arg #2 : sweet
    arg #3 : --second
    arg #4 : sour
    arg #5 : --new-command
    arg #6 : --newer-command
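
The splitting step used by parse_n_run can be looked at in isolation. The sketch below assumes the same 3rd-column format as above and writes a literal "--" in the replacement text; the variable names split and parsed are made up for this illustration.

```bash
#!/bin/bash
c3='--command yes this is bad --other-command single_string'

# Put a newline before every "--" so that each option starts its own line.
split=${c3//--/$'\n'--}

parsed=()
while read -r option string
do
    [[ -z "${option}" ]] && continue                  # first line is empty because c3 starts with "--"
    parsed+=( "${option}" )                           # first word on the line is the option
    [[ -n "${string}" ]] && parsed+=( "${string}" )   # the rest of the line (if any) is its argument
done <<< "${split}"

printf '[%s]\n' "${parsed[@]}"
```

This prints four bracketed lines: [--command], [yes this is bad], [--other-command] and [single_string]. Note that the substitution matches every "--", so an argument that itself contains "--" would also be split at that point.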
    

    If OP needs to parse mycsv.csv once and (re)use the results multiple times, and has access to bash 4.3+, I'd probably use a nameref (declare -n) to manage one array per row: args_1[], args_2[], ..., args_n[].
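
For readers who have not used namerefs, here is a minimal sketch of the mechanism on its own, assuming bash 4.3+. The names ref, demo_1, demo_2 and names are hypothetical and only serve this illustration.

```bash
#!/bin/bash
# Requires bash 4.3+ for "declare -n".
names=()

for i in 1 2
do
    declare -n ref="demo_${i}"      # ref is now an alias for demo_1, then demo_2
    ref=( "--first" "row ${i}" )    # assigning through ref fills the target array
    names+=( "${!ref}" )            # for a nameref, ${!ref} expands to the target's name
done

unset -n ref                        # remove the alias itself, not the array it points to

declare -p demo_1 demo_2            # show both arrays
echo "${names[@]}"                  # prints: demo_1 demo_2
```

Re-running declare -n with a new target re-points the alias, which is what lets a single variable name address a different array on each loop iteration.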

    Making a few tweaks to the current script:

    $ cat parse_n_run
    #!/bin/bash
    
    cnt=0
    
    ##########
    # parse and store
    
    echo "Parsing ..."
    
    while IFS=, read -r c1 c2 c3
    do
        printf "\n########### %s : %s : %s\n\n" "${c1}" "${c2}" "${c3}"
    
        ((cnt++))
        declare -n _args="args_${cnt}"                          # name ref
    
        _args=( "--first" "${c1}" "--second" "${c2}" )
    
        while read -r option string
        do
            [[ -z "${option}" ]] && continue
            _args+=( "${option}" )
            [[ -n "${string}" ]] && _args+=( "${string}" )
        done <<< "${c3//--/$'\n'--}"                            # literal "--" instead of "&" ("&" as matched text needs bash 5.2+)
    
        typeset -p "${!_args}"
    done < mycsv.csv
    
    ### at this point we have 4 arrays: args_1[], args_2[], args_3[] and args_4[]
    
    ##########
    # make use of arrays
    
    printf "\n###########\n"
    echo "Using ..."
    
    for ((i=1; i<=cnt; i++))
    do
        declare -n _args="args_$i"
    
        printf "\n### %s\n\n" "${!_args}"
    
        execute_program_name "${_args[@]}"
    done
    

    Taking for a test drive:

    NOTE: the 4 declare -a lines in the output below correspond to our 4 arrays args_1[], args_2[], args_3[] and args_4[]

    $ parse_n_run
    Parsing ...
    
    ########### love : hate : --command yes this is bad --other-command
    
    declare -a args_1=([0]="--first" [1]="love" [2]="--second" [3]="hate" [4]="--command" [5]="yes this is bad" [6]="--other-command")
    
    ########### fox : duck : --command yes this is bad --other-command single_string
    
    declare -a args_2=([0]="--first" [1]="fox" [2]="--second" [3]="duck" [4]="--command" [5]="yes this is bad" [6]="--other-command" [7]="single_string")
    
    ########### turtle : rabbit  :
    
    declare -a args_3=([0]="--first" [1]="turtle" [2]="--second" [3]="rabbit ")
    
    ########### sweet : sour : --new-command --newer-command
    
    declare -a args_4=([0]="--first" [1]="sweet" [2]="--second" [3]="sour" [4]="--new-command" [5]="--newer-command")
    
    ###########
    Using ...
    
    ### args_1
    
    ### execute_program_name --first love --second hate --command yes this is bad --other-command
    arg #1 : --first
    arg #2 : love
    arg #3 : --second
    arg #4 : hate
    arg #5 : --command
    arg #6 : yes this is bad
    arg #7 : --other-command
    
    ### args_2
    
    ### execute_program_name --first fox --second duck --command yes this is bad --other-command single_string
    arg #1 : --first
    arg #2 : fox
    arg #3 : --second
    arg #4 : duck
    arg #5 : --command
    arg #6 : yes this is bad
    arg #7 : --other-command
    arg #8 : single_string
    
    ### args_3
    
    ### execute_program_name --first turtle --second rabbit
    arg #1 : --first
    arg #2 : turtle
    arg #3 : --second
    arg #4 : rabbit
    
    ### args_4
    
    ### execute_program_name --first sweet --second sour --new-command --newer-command
    arg #1 : --first
    arg #2 : sweet
    arg #3 : --second
    arg #4 : sour
    arg #5 : --new-command
    arg #6 : --newer-command