
Awk: illegal variable reference error in Ubuntu but works OK in MinGW-64


This script runs flawlessly in MinGW-64 but it fails in Ubuntu with

awk: /home/username/chuleta/chuleta/glst.awk: line 22: illegal reference to variable a

awk: /home/username/chuleta/chuleta/glst.awk: line 23: illegal reference to variable a

...even after changing the shebang from #!/usr/bin/gawk -f to #!/usr/bin/awk -f to rule out an implementation difference between GNU Awk and non-GNU Awk.

#!/usr/bin/awk -f

# generates list of terms related to a topic

# this function substitutes a with b in $0
func change(a,b) {
    while (i=index($0,a))
            $0 = substr($0,1,i-1) b substr($0,i+length(a))
}

{
    # we delete the base folder (received in var RTO) from the input line
    # and clean it of other stuff
    change(RTO,"")
    change(".txt","")
    change("/"," ")
    change("_"," ")
    split($0,a," ")
    # using arr as associative array by using same string as index
    # this prevents duplicated strings and code is shorter
    # since we don't have to maintain a counter for index
    for (x in a)  # this is line 22
        arr[a[x]]=a[x] # this is line 23
}

END{
    n=asort(arr,sarr)
    for (x in sarr)
        printf("%s ",sarr[x])
}

Variable a is supposed to be an array resulting from splitting $0 using space as a delimiter.

What is happening? As already mentioned, this error doesn't happen in MinGW-64 (the Linux bash emulation used by Git for Windows).

EDIT:

An example data set that is piped through that Awk script is as follows:

/home/username/chuleta/chuleta-data/java/8/chuleta_foreach_loop.txt
/home/username/chuleta/chuleta-data/java/8/chuleta_basic_functional_interfaces_in_function_package.txt
/home/username/chuleta/chuleta-data/java/8/chuleta_advantages_new_date_time_api.txt
/home/username/chuleta/chuleta-data/java/8/chuleta_new_date_time_api.txt
/home/username/chuleta/chuleta-data/java/8/chuleta_solve_interface_default_methods_conflict.txt
/home/username/chuleta/chuleta-data/java/8/chuleta_lambdas_with_parameters.txt

RTO is passed to the script and holds the base dir (/home/username/chuleta/chuleta-data/). The script removes it from the beginning of each line, removes ".txt", changes "/" and "_" into spaces, and then splits the resulting space-separated word list into an array ("a"), which is later used to populate an associative array. Basically we get a space-separated list of subfolder names excluding the base dir, for example "java 8 chuleta foreach loop"; the resulting array a contains those five items. The code lines where a is traversed are the ones that fail in Ubuntu but not in MinGW-64.
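
For illustration, the clean-up steps described above can be sketched as a small shell function wrapping awk. This is only a sketch of the intended transformation, not the script itself: the names clean_line and replace_all are hypothetical, and the base dir is assumed to be passed as the first argument.

```shell
# Hypothetical helper: print each input line after the clean-up steps.
# Usage: clean_line BASE_DIR < data.txt
clean_line() {
    awk -v RTO="$1" '
    # Replace every literal occurrence of old with new in str (no regex).
    function replace_all(str, old, new,    pos, out) {
        if (old == "") return str          # nothing to search for
        out = ""
        while ((pos = index(str, old)) > 0) {
            out = out substr(str, 1, pos - 1) new
            str = substr(str, pos + length(old))
        }
        return out str
    }
    {
        line = replace_all($0, RTO, "")     # drop the base dir
        line = replace_all(line, ".txt", "") # drop the extension
        line = replace_all(line, "/", " ")   # path separators to spaces
        line = replace_all(line, "_", " ")   # underscores to spaces
        print line
    }'
}
```

Feeding the first sample line through clean_line with the base dir (including the trailing slash) as argument should give "java 8 chuleta foreach loop".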

EDIT 2:

Output in Ubuntu 22.04, awk version mawk 1.3.4 20200120, GNU bash, version 5.1.16:

$ cat data.txt | ./program.awk -v RTO=/home/username/chuleta/chuleta-data
awk: ./prueba.sh: line 18: illegal reference to variable a
awk: ./prueba.sh: line 22: illegal reference to variable a
awk: ./prueba.sh: line 23: illegal reference to variable a
awk: ./prueba.sh: line 23: illegal reference to variable a

Output in MINGW64_NT-10.0-19044 3.1.7-340, awk version GNU Awk 5.0.0, API: 2.0, GNU bash, version 4.4.23:

$ cat data.txt | ./program.awk -v RTO=/home/username/chuleta/chuleta-data
8 advantages api basic chuleta conflict date default foreach function functional in interface interfaces java lambdas loop methods new package parameters solve time with

Solution

  • Your script uses GNU Awk features which are not available in Mawk.

    The simple fix is probably to install GNU Awk with apt-get install -y gawk on your Ubuntu system, and then use update-alternatives to make sure that awk points to gawk.

    In some more detail, renaming the array a to k in the main rule removes the immediate problem with Mawk, but it then complains that asort is never defined. (This function is a GNU Awk extension.)
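
If installing gawk is not an option, one possible workaround is to avoid asort entirely and let the external sort utility do the ordering. The following is a minimal sketch under that assumption, not a drop-in replacement for the original script: list_terms, words and seen are hypothetical names, the base dir is only stripped when it appears at the start of a line, and the substitutions use gsub instead of the original change function.

```shell
# Hypothetical portable variant: no asort, no array sharing a name with a
# function parameter. Usage: list_terms BASE_DIR < data.txt
list_terms() {
    awk -v RTO="$1" '
    {
        # Strip the base dir only when the line starts with it.
        if (RTO != "" && index($0, RTO) == 1)
            $0 = substr($0, length(RTO) + 1)
        gsub(/[.]txt/, "")       # drop the extension
        gsub("[/_]", " ")        # separators and underscores to spaces
        n = split($0, words, " ")
        for (i = 1; i <= n; i++)
            seen[words[i]] = 1   # array keys deduplicate the words
    }
    END {
        for (w in seen)
            print w              # unordered, sorted by the pipeline below
    }' | LC_ALL=C sort | paste -s -d " " -
}
```

With the sample data from the question, cat data.txt | list_terms /home/username/chuleta/chuleta-data should print the same word list as the GNU Awk run, ending in a newline rather than a trailing space.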