Search code examples
pythonbashawkgrep

Analyze logs. How to filter by referce number


I have got following log.txt

log 1111 xyzqwe
log 1111 xyzrty
aaa
bbb
ccc
ttt
log 2222 xyzuio
log 1111 xyzopa
iii
uuu
yyy
log 1111 xyzsdf
log 2222 xyzghj
aaa
ddd
log 2222 xyzklz
log 1111 xyzxcv
ffff
log 2222 xyzbnm
log 1111 xyzert
log 1111 xyzuyt

I need to get following results.

for reference 1111 

log 1111 xyzqwe
log 1111 xyzrty
aaa
bbb
ccc
ttt
log 1111 xyzopa
iii
uuu
yyy
log 1111 xyzsdf
log 1111 xyzxcv
ffff
log 1111 xyzert
log 1111 xyzuyt

for reference 2222

log 2222 xyzuio
log 2222 xyzghj
aaa
ddd
log 2222 xyzklz
log 2222 xyzbnm

How can I fix it by python|bash|grep|awk?


Solution

  • Here's a solution using awk which saves the scope (interpreted as second word in a line that starts with log ), and compares it to the reference (provided as argument using the -v option):

    awk -v ref=1111 '/^log / {scope = $2} ref == scope' log.txt
    
    log 1111 xyzqwe
    log 1111 xyzrty
    aaa
    bbb
    ccc
    ttt
    log 1111 xyzopa
    iii
    uuu
    yyy
    log 1111 xyzsdf
    log 1111 xyzxcv
    ffff
    log 1111 xyzert
    log 1111 xyzuyt
    

    Here's the very same approach, rewritten in bash:

    ref=1111
    while read -r line
    do
      read -r cmd id _ <<< "$line"
      [[ "$cmd" == log ]] && scope="$id"
      [[ "$ref" == "$scope" ]] && printf '%s\n' "$line"
    done < log.txt