
Issue with AWK array length?


I have a tab-separated matrix in a file (say, filename).

If I do:

head -1 filename | awk -F "\t" '{i=0;med=0;for(i=2;i<=NF;i++) array[i]=$i;asort(array);print length(array)}'

followed by:

head -2 filename | tail -1 | awk -F "\t" '{i=0;med=0;for(i=2;i<=NF;i++) array[i]=$i;asort(array);print length(array)}'

I get 24 each time, and the same holds for every other row when I extract rows one at a time like this.

But if I run it over the whole file:

cat filename | awk -F "\t" '{i=0;med=0;for(i=2;i<=NF;i++) array[i]=$i;asort(array);print length(array)}'

I get:

24
25
25
25
25 ...

Why is that?

This is the input file:

Case1   17.49   0.643   0.366   11.892  0.85    5.125   0.589   0.192   0.222   0.231   27.434  0.228   0   0.111   0.568   0.736   0.125   0.038   0.218   0.253   0.055   0.019   0   0.078  
Case2   0.944   2.412   4.296   0.329   0.399   1.625   0.196   0.038   0.381   0.208   0.045   1.253   0.382   0.111   0.324   0.268   0.458   0.352   0   1.423   0.887   0.444   5.882   0.543  
Case3   21.266  14.952  24.406  10.977  8.511   21.75   6.68    0.613   12.433  1.48    1.441   21.648  6.972   42.931  8.029   4.883   11.912  6.248   4.949   26.882  9.756   5.366   38.655  12.723  
Case4   0.888   0   0.594   0.549   0.105   0.125   0   0   0.571   0.116   0.019   1.177   0.573   0.111   0.081   0.401   0   0.05    0.073   0   0   0   0   0.543

Solution

  • Well, I found an answer to my own problem:

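    I wonder how I missed it, but the array has to be cleared at the end of each iteration (each input record) whenever the same array name is reused, whatever the language or script. Here asort() renumbers the sorted values to indices 1..n, while the loop only refills indices 2..NF, so from the second row on, index 1 still holds a leftover value from the previous row and the length comes out as 25 instead of 24.

    The corrected awk command was: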

    cat filename | awk -F "\t" '{i=0;med=0;for(i=2;i<=NF;i++) array[i]=$i;asort(array);print length(array);delete array}'