awk csh

How to print a table of tool usage per user using awk


I have two input files.
input1 contains user names, each followed by that user's tools and numbers:

######
User: a
#####
aem       12
aqwt      24
#####
User: B
#####
aem       34
bwem      52
dd        12
#####
User: C
#####
aem       11
aqwt      2
dd        1
##### 

input2 is a list of tools:

aem
aqwt
bwem
dd

I want the output to list each tool against each user, with 0 where a user has no entry for a tool:

Tool  a   B   C
aem   12  34  11
aqwt  24  0   2
bwem  0   52  0
dd    0   12  1

I tried using awk to compare input2 with input1 and print the numbers, but the script also prints output for the comment lines:

awk 'NR==FNR{a[$1];next} ($1 in a); {print $2} !($1 in a); {print "0"}' input2 input1

The output is:

0
0
0
12
24
0
0
0
34
52
12
0
0
0
11
2
1
0

Does anyone know how to separate the output into columns and skip the comment lines? I'm new to this language. Thank you.


Solution

  • Would you please try the following:

    awk '
        NR==FNR {                                   # process "input2"
            tool[FNR] = $0                          # store the tool name in an array "tool"
            ntool = FNR                             # count of tools
            next                                    # skip the following statements
        }
    
        /^User:/ {                                  # "User" line in "input1"
            gsub(/[^:]+: */, "")                    # extract the username
            user[++nuser] = $0                      # store the user name in an array "user"
            next
        }
    
        !/^#/ {                                     # non-comment line in "input1"
            num[$1, nuser] = $2                     # store the number in an array "num"
        }
    
        END {
            printf "Tool"                           # print the header line
            for (j = 1; j <= nuser; j++) {          # print the user names
                printf("\t%s", user[j])
            }
            print ""
    
            for (i = 1; i <= ntool; i++) {          # print the body lines
                printf("%s", tool[i])               # print the tool name
                for (j = 1; j <= nuser; j++) {
                    printf("\t%d", num[tool[i], j]) # print the number indexed by tool name and nuser
                }
                print ""
            }
        }
    ' input2 input1
    

    Output:

    Tool    a       B       C
    aem     12      34      11
    aqwt    24      0       2
    bwem    0       52      0
    dd      0       12      1
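
If the space-aligned layout shown in the question is preferred over tab-separated columns, one possible variant is to use fixed-width printf formats in the END block. The following is only a minimal sketch under stated assumptions: the widths 6 and 4 are chosen to fit the sample data, and the temporary directory and the shell variable `out` are hypothetical helpers that exist only to make the example self-contained.

```shell
#!/bin/sh
# Recreate the sample inputs in a temporary directory (illustration only).
dir=$(mktemp -d)
cat > "$dir/input1" <<'EOF'
######
User: a
#####
aem       12
aqwt      24
#####
User: B
#####
aem       34
bwem      52
dd        12
#####
User: C
#####
aem       11
aqwt      2
dd        1
#####
EOF
printf '%s\n' aem aqwt bwem dd > "$dir/input2"

out=$(awk '
    NR==FNR { tool[++ntool] = $1; next }              # first file: tool names
    /^User:/ { sub(/[^:]+: */, ""); user[++nuser] = $0; next }
    !/^#/ && NF { num[$1, nuser] = $2 }               # data line: skip comments and blank lines
    END {
        printf("%-6s", "Tool")                        # assumed width 6 for the tool column
        for (j = 1; j <= nuser; j++)                  # assumed width 4 per user column
            printf((j < nuser ? "%-4s" : "%s"), user[j])
        print ""
        for (i = 1; i <= ntool; i++) {
            printf("%-6s", tool[i])
            for (j = 1; j <= nuser; j++)              # last column is not padded
                printf((j < nuser ? "%-4d" : "%d"), num[tool[i], j])
            print ""
        }
    }
' "$dir/input2" "$dir/input1")
rm -r "$dir"
echo "$out"
```

Longer tool names or larger numbers would need wider formats than the ones assumed here.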
    

    As the elements of the array num[] are printed using the %d format, an undefined element is printed as 0.
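
This behavior can be checked in isolation. In the small sketch below, an element that was never assigned is printed once with %d and once with %s; the array name and key mirror the script above, while `zero` and `empty` are just illustrative shell variables.

```shell
# An unassigned awk array element is an empty string that also counts as numeric 0.
zero=$(awk 'BEGIN { printf("%d", num["bwem", 1]) }')
empty=$(awk 'BEGIN { printf("[%s]", num["bwem", 1]) }')
echo "$zero"    # 0
echo "$empty"   # []
```

So the 0 in the table comes from the %d conversion, not from any value stored in the array.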

    Explanation of gsub(/[^:]+: */, ""):

    • If the User line is guaranteed to have whitespace after the colon, as in User: a, the user name can be extracted simply as $2.
    • I may be overthinking, but I also considered the case where there is no whitespace after the colon, as in User:a.
    • The first argument /[^:]+: */ of gsub(/[^:]+: */, "") is a regex that matches one or more non-colon characters followed by a colon and zero or more spaces.
    • gsub() replaces the matched substring with the second argument "" (the empty string). Because no target is given, it operates on $0, so what remains in $0 is the user name we want.
    • sub() would be the more appropriate function than gsub(), although there is no difference in this case.
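
The points above can be tried directly on the command line. This is a small sketch using sub(); the input lines are made up for illustration, and the name containing a colon is a hypothetical case that does not occur in the question's data.

```shell
# sub() removes only the first match, so both formats reduce to the bare name.
names=$(printf 'User: a\nUser:a\n' | awk '{ sub(/[^:]+: */, ""); print }')
echo "$names"

# Hypothetical name containing a colon: sub() keeps it, gsub() also removes "x:".
with_sub=$(echo 'User: x:y' | awk '{ sub(/[^:]+: */, ""); print }')
with_gsub=$(echo 'User: x:y' | awk '{ gsub(/[^:]+: */, ""); print }')
echo "$with_sub"    # x:y
echo "$with_gsub"   # y
```

For the sample input both functions give the same result; they differ only if a user name itself contains a colon.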