How to choose which array to loop through before running awk for loop?


Using awk (not gawk), I would like to iterate through one of two arrays, depending on the value of a separate variable. One array contains a subset of the data in the other. The for loop is large, so I would prefer not to use if() { for() {} } else { for() {} } and double the length of my code. I want to run the same for loop on only one of the two arrays.

Do I have to use two for loops? I would also rather not iterate over more values than necessary.

Here I was hoping that one of the two arrays would be selected depending on the value of independent_variable, but this code gives a syntax error at the first of the enclosed brackets (and besides, there would be no concrete array name to reference inside the loop):

for (i in (independent_variable ? array1 : array2)) {
    # process the selected array
}

Otherwise, is it not possible to create a new temporary array from one of the original ones, without using for loops to populate the temporary array? This gives an error saying that I'm treating the arrays like scalars:

if (independent_variable) { relevant_array = array1 }
else { relevant_array = array2 }

for (i in relevant_array)
{
    # process the selected array
}

EDIT: Here is my code that calls a function containing the for loop, using a ternary operator. It doesn't work (error message below):

ternary_exp_for.awk

function do_loop(arr,   i) {
    for (i in arr) {
        print "within function: " i arr[i]
    }
}

END {
    array1[1]="abc"
    array1[2]="def"
    array2[1]="ABC"
    array2[2]="DEF"
    for (i in array1) {
        print "outside function, array1: " array1[i]
    }
    for (j in array2) {
        print "outside function, array2: " array2[j]
    }
    
    independent_variable=0
    print "\nwith independent_variable == " independent_variable
    do_loop(independent_variable ? array1 : array2)
    
    independent_variable=1
    print "\nwith independent_variable == " independent_variable
    do_loop(independent_variable ? array1 : array2)
}

Running awk -f ternary_exp_for.awk <<< "test" gives me this output, followed by an error:

outside function, array1: abc
outside function, array1: def
outside function, array2: ABC
outside function, array2: DEF

with independent_variable == 0
awk: ternary_exp_for.awk:21: (FILENAME=- FNR=1) fatal: attempt to use array `array2' in a scalar context

Solution

  • You can't save the name of an array in a scalar variable, nor can you create a pointer to an array, nor can you copy an array by assignment.

    Holding common code in one place is one of the reasons functions exist, so put the loop in a function and call it with the appropriate array, e.g.:

    function do_loop(arr,     i) {
        for ( i in arr ) {
            print i, arr[i]
        }
    }
    

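    Use a ternary expression to choose between the two calls, rather than passing a ternary expression as the argument. The operands of ?: are evaluated as scalars, which is why do_loop(independent_variable ? array1 : array2) fails with the "scalar context" error: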

    { independent_variable ? do_loop(array1) : do_loop(array2) }
    

    or just an if-else:

    { if (independent_variable) do_loop(array1); else do_loop(array2) }
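
To tie this back to the script in the question, here is a minimal sketch that reuses its test data and moves the choice of array to the call site. The shell wrapper run_demo and the use of -v to set independent_variable are assumptions made for illustration; they are not part of the original script.

```shell
#!/bin/sh
# Sketch only: run_demo is a hypothetical wrapper that passes its first
# argument to awk as independent_variable.
run_demo() {
    awk -v independent_variable="$1" '
    # The shared loop body lives in one place. arr is passed by reference;
    # i is a local variable, declared as an extra parameter.
    function do_loop(arr,    i) {
        for (i in arr) {
            print "within function: " i " " arr[i]
        }
    }

    BEGIN {
        array1[1] = "abc"; array1[2] = "def"
        array2[1] = "ABC"; array2[2] = "DEF"

        # Choose which call to make; the array name itself is never
        # used as an operand of a scalar expression.
        if (independent_variable) {
            do_loop(array1)
        } else {
            do_loop(array2)
        }
    }'
}

run_demo 0   # loops over array2 only
run_demo 1   # loops over array1 only
```

Only the selected array is traversed, so no extra elements are visited. Note that awk does not specify the order in which for (i in arr) visits the indices, so the lines printed by each call may appear in either order.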