
Can awk find a field containing a string from a list?


I have a file containing several fields, and a second file containing a list of words. I need an awk command that extracts from the 1st file all records where a specific field contains one or more of the words from the 2nd file.

For example, the 1st file:

Feb 15 12:05:10 lcif adm.slm: root [23416]: cd /tmp
Feb 15 12:05:24 lcif adm.slm: root [23416]: cat tst.sh
Feb 15 12:05:44 lcif adm.slm: root [23416]: date
Feb 15 12:05:52 lcif adm.pse: root [23419]: rm -f file
Feb 15 12:05:58 lcif adm.pse: root [23419]: who
Feb 15 12:06:02 lcif adm.pse: root [23419]: uptime
Feb 15 12:06:56 lcif adm.pse: root [23419]: reboot
Feb 15 12:06:58 lcif adm.pse: root [23419]: ls -lrt

And the 2nd file:

rm
reboot
shutdown

The awk command should then return:

Feb 15 12:05:52 lcif adm.pse: root [23419]: rm -f file
Feb 15 12:06:56 lcif adm.pse: root [23419]: reboot

I tried desperately with arrays/maps.

I also tried this:

awk -F ": " '{if ($3 ~ "^rm" || $3 ~ "^reboot" || $3 ~ "^shutdown") print}'

But the list of words I'm looking for is getting bigger and bigger. I'd rather use a file list.

Appreciate any help.

Thank you! Serge


Solution

  • Don't waste time with arrays. Just generate the hard-coded regex dynamically, on the fly:

    printf '%s' "${file_a}" | 
    
    gawk -p-             -b 'BEGIN { FS = "[]]: " } '"$(
    
     awk -v  __="${file_b}" 'BEGIN { 
    
        FS = RS  ;    OFS = "|"
        RS = "^$"; _= ORS =  ""
    
        $_ = __
    
        print "$NF ~ \"^(" $(_*(NF-=_==$NF)) ")( |$)\"" }' )"
    
    Feb 15 12:05:52 lcif adm.pse: root [23419]: rm -f file
    Feb 15 12:06:56 lcif adm.pse: root [23419]: reboot
    
    # the dynamically generated program is equivalent to:
    
    awk 'BEGIN { FS = "[]]: " } $NF ~ "^(rm|reboot|shutdown)( |$)" ' 
    

    Then, instead of looping through an array, awk makes a single high-speed pass through file A without having to store any rows along the way.
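
For readers who find the one-liner above hard to follow, here is a more verbose sketch of the same idea that reads both inputs from files rather than shell variables. The temporary files and the variable names (`log`, `words`, `alternation`, `result`) are assumptions made for illustration and are not part of the original answer. It also assumes the words contain no regex metacharacters, since each one is spliced into the pattern as-is.

```shell
#!/bin/sh
# Hypothetical temp files standing in for "file A" (the log) and "file B" (the word list).
log=$(mktemp)
words=$(mktemp)

cat > "$log" <<'EOF'
Feb 15 12:05:10 lcif adm.slm: root [23416]: cd /tmp
Feb 15 12:05:24 lcif adm.slm: root [23416]: cat tst.sh
Feb 15 12:05:44 lcif adm.slm: root [23416]: date
Feb 15 12:05:52 lcif adm.pse: root [23419]: rm -f file
Feb 15 12:05:58 lcif adm.pse: root [23419]: who
Feb 15 12:06:02 lcif adm.pse: root [23419]: uptime
Feb 15 12:06:56 lcif adm.pse: root [23419]: reboot
Feb 15 12:06:58 lcif adm.pse: root [23419]: ls -lrt
EOF

printf '%s\n' rm reboot shutdown > "$words"

# Step 1: join the non-empty lines of the word list with "|",
# giving an alternation such as rm|reboot|shutdown.
alternation=$(awk 'NF { out = out sep $0; sep = "|" } END { print out }' "$words")

# Step 2: one pass over the log. The field separator "]: " makes the
# command the last field; keep rows whose command starts with a listed
# word followed by a space or the end of the line.
result=$(awk -F '[]]: ' -v re="^(${alternation})( |\$)" '$NF ~ re' "$log")

printf '%s\n' "$result"

rm -f "$log" "$words"
```

With the sample data from the question, this prints the `rm -f file` and `reboot` rows. The trailing `( |$)` in the pattern is what keeps a word such as `rm` from also matching a longer command like `rmdir`.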