
Log analysis using AWK


I am looking for help with a log analysis problem that I have been racking my brain over for some time. I have a log file that contains logs from multiple processes, but it is not sorted. Generally, each line in the log file starts with the process id, but in some cases an entry spans multiple lines, as shown below:

90234  abcd 
90234  pqrs
98765  nbnbbb
34072  tabt
90234  stuv        -|
       tttt         |- entry spanning over multiple lines
       gggg        -|
34072  yyyy
98765  tytyy

My task is to extract all log entries for a given pid.

Given a pid, the output is expected in the following format:

For pid 90234:

90234  abcd 
90234  pqrs
90234  stuv
       tttt
       gggg

For pid 34072:

34072  tabt
34072  yyyy

For pid 98765:

98765  nbnbbb
98765  tytyy

I would really appreciate any help. I want to do this using AWK, so please stick to AWK alone. Thank you all in advance.


Solution

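  • An alternative awk approach, since the number of fields may not be constant in a log file: remember the pid of the most recent line that starts with digits, and print every line for as long as that pid matches.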

    $ awk '/^[0-9]+/{p=$1} p==90234' log
    
    90234  abcd
    90234  pqrs
    90234  stuv        -|
          tttt         |- entry spanning over multiple lines
          gggg        -|
    

    You can make the pid a variable, as in @peak's example.
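
A minimal sketch of the variable form, assuming the log is a plain text file and that continuation lines never start with a digit. The function name `logs_for_pid` and the awk variable names `pid` and `cur` are illustrative choices, not taken from @peak's answer; only the standard `-v` option of awk is relied on.

```shell
#!/bin/sh
# Hypothetical helper: print every log entry that belongs to one pid,
# including continuation lines that do not start with a pid.
logs_for_pid() {
    # $1 = pid to extract, $2 = path to the log file
    awk -v pid="$1" '
        /^[0-9]+/ { cur = $1 }   # a line starting with digits opens a new entry
        cur == pid               # print lines while the current entry belongs to pid
    ' "$2"
}
```

It would be called as `logs_for_pid 90234 log`. Passing the pid with `-v` keeps the awk program itself constant, so the same script can be reused for any pid without editing the quoted program text.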