awk data-processing

How can I filter rows by the values in a multi-valued column using awk?


I would like to use awk to filter rows based on a multi-valued column.

My data has two columns separated by the delimiter ;. The second column holds three float values separated by whitespace.

randUni15799:1;0.00 0.00 0.00
randUni1785:1;0.00 0.00 0.00
randUni18335:1;0.00 0.00 0.00
randUni18368:1;223.67 219.17 0.00
randUni18438:1;43.71 38.71 1.52

I want to keep only the rows in which the first and second values of the second column are both greater than 200. For the sample above, the expected output is:

randUni18368:1;223.67 219.17 0.00

Update: With help from the comments, I tried this and it worked:

awk -F ";" '{split($2, a, " "); if (a[1] > 200 && a[2] > 200) print}'

Solution

  • One awk idea:

    awk -F';' '{ n=split($2,a,/[[:space:]]+/)            # split 2nd field on spaces; place values in array a[]
                 if (a[1] > 200 && a[2] > 200)           # if 1st and 2nd array entries > 200 then ...
                    print                                # print current line to stdout
               }
    ' randum.dat
    
    # or as a one-liner
    
    awk -F';' '{ n=split($2,a,/[[:space:]]+/); if (a[1] > 200 && a[2] > 200) print}' randum.dat
    
    # reduced further based on OP's comments/questions:
    
    awk -F';' '{ split($2,a," "); if (a[1] > 200 && a[2] > 200) print}' randum.dat
    

    This generates:

    randUni18368:1;223.67 219.17 0.00
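
Two possible variations on the same idea, offered as a sketch rather than a definitive solution. The first passes the threshold in with -v (the variable name min is made up for this example) and adds 0 to each value to force a numeric comparison. The second splits each line on both ; and whitespace, so the values can be tested directly as $2 and $3; it assumes the first column never contains whitespace. The sample rows from the question are held in a shell variable here in place of a data file.

```shell
# Sample rows copied from the question; they stand in for the data file.
data='randUni15799:1;0.00 0.00 0.00
randUni1785:1;0.00 0.00 0.00
randUni18335:1;0.00 0.00 0.00
randUni18368:1;223.67 219.17 0.00
randUni18438:1;43.71 38.71 1.52'

# Variant 1: threshold passed in with -v (min is a hypothetical name).
result=$(printf '%s\n' "$data" | awk -F';' -v min=200 '
{
    split($2, a, " ")                      # a[1], a[2], a[3] hold the values of field 2
    if (a[1] + 0 > min && a[2] + 0 > min)  # adding 0 forces a numeric comparison
        print
}')

# Variant 2: split on ";" and whitespace together, so no split() call is needed.
# Assumes the first column contains no whitespace.
result_fs=$(printf '%s\n' "$data" | awk -F'[;[:space:]]+' '$2 > 200 && $3 > 200')

printf '%s\n' "$result"
printf '%s\n' "$result_fs"
```

Both commands print only the randUni18368:1 row for this sample. Variant 2 is shorter, but it shifts the field numbers by one, so variant 1 may be easier to adapt if the first column can ever contain spaces.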