I am looking for help with a log analysis problem that I have been racking my brain over for some time. I have a log file that contains entries from multiple processes, but it is not in sorted order. Generally each line in the log file starts with the process ID, but in some cases an entry spans multiple lines, as shown below:
90234 abcd
90234 pqrs
98765 nbnbbb
34072 tabt
90234 stuv -|
tttt |- entry spanning over multiple lines
gggg -|
34072 yyyy
98765 tytyy
My task is to extract all log entries for a given pid.
Given a pid, the output is expected in the following format:
For pid 90234:
90234 abcd
90234 pqrs
90234 stuv
tttt
gggg
For pid 34072:
34072 tabt
34072 yyyy
For pid 98765:
98765 nbnbbb
98765 tytyy
I would really appreciate any help. I want to do this using AWK, so let's try to stick to AWK alone. Thank you all in advance.
An alternative awk solution, since the number of fields may not be constant in a log file:
$ awk '/^[0-9]+/{p=$1} p==90234' log
90234 abcd
90234 pqrs
90234 stuv -|
tttt |- entry spanning over multiple lines
gggg -|
You can make the pid a variable, as in @peak's example.
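For what it's worth, here is a minimal sketch of passing the pid in as a variable using awk's standard `-v` option. This is not necessarily how @peak's answer does it. The file name `sample.log` and the wrapper function `extract_pid` are made up for illustration, and the sample data is the log from the question with the annotations left out. It also assumes continuation lines never begin with a digit; if they can, the pattern would need tightening.

```shell
# Recreate the sample log from the question (annotations omitted).
# sample.log is a hypothetical file name used only for this sketch.
cat > sample.log <<'EOF'
90234 abcd
90234 pqrs
98765 nbnbbb
34072 tabt
90234 stuv
tttt
gggg
34072 yyyy
98765 tytyy
EOF

# extract_pid is a hypothetical helper: $1 is the pid, $2 is the log file.
# awk remembers the pid of the most recent line that starts with digits,
# so continuation lines inherit it; a line is printed when that
# remembered pid equals the requested one.
extract_pid() {
    awk -v pid="$1" '/^[0-9]+/{p=$1} p==pid' "$2"
}

extract_pid 90234 sample.log
```

Calling `extract_pid 34072 sample.log` or `extract_pid 98765 sample.log` gives the entries for the other two processes in the same way.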