I'm looking for a command-line way (on SunOS) to pull XML messages that contain a particular string out of a log file.
For example, the log file might contain xml messages of the form:
<message>
<body>
<tags> uniqueId="123456" </tags>
</body>
</message>
These sit alongside other timestamped log lines. There may be several XML messages containing the same ID, as the same record may have been run several times.
To pull out the XML messages I currently have this awk command:
nawk '$0~s{for(c=NR-b;c<=NR+a;c++)r[c]=1}{q[NR]=$0}END{for(c=1;c<=NR;c++)if(r[c])print q[c]}' b=4 a=15 s="someUniqueId" file
The problem is that this pulls out a fixed number of lines, but the XML messages vary in length. I'm struggling to find a way to modify it so that it finds the unique ID and then pulls all lines back up to the preceding "<message>"
and all lines down to the following "</message>".
This probably works in a perfect world (if I understood your question right):
$ cat file
<message>
<body>
<tags> uniqueId="123455" </tags>
</body>
</message>
<message>
<body>
<tags> uniqueId="123456" </tags> # the one we want
</body>
</message>
<message>
<body>
<tags> uniqueId="123457" </tags>
</body>
</message>
The awk:
$ awk '
{
b=b ORS $0 # buffer records
}
/<message>/ {
b=$0 # reset buffer
}
/<\/message>/ && b~/uniqueId="123456"/ { # if condition met at the end marker
print b # output buffer
}' file
Output:
<message>
<body>
<tags> uniqueId="123456" </tags> # the one we wanted
</body>
</message>
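Since the question mentions SunOS, note that the stock /usr/bin/awk there is the old awk; the script above should be run with nawk (or /usr/xpg4/bin/awk) instead. A minimal sketch of the same idea with the ID passed in via -v so it isn't hard-coded (the sample file path and variable name id are just for illustration):

```shell
# Build a sample log in the same shape as the question's example
cat > /tmp/sample.log <<'EOF'
<message>
<body>
<tags> uniqueId="123455" </tags>
</body>
</message>
<message>
<body>
<tags> uniqueId="123456" </tags>
</body>
</message>
EOF

# On SunOS, replace awk with nawk. index() does a plain substring
# match, so the ID needs no regex escaping.
awk -v id='uniqueId="123456"' '
{ b = b ORS $0 }                              # buffer records
/<message>/    { b = $0 }                     # reset buffer at start tag
/<\/message>/ && index(b, id) { print b }     # print block if ID found
' /tmp/sample.log
```

Because the buffer is reset at every <message>, this prints each matching block in full regardless of its length, and it prints every occurrence when the same ID appears in several messages.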