I have the following sample.txt file:
2021-10-07 10:32:05,767 ERROR [LAWT2] blah.blah.blah - Message processing FAILED: <ExecutionReport blah="xxx" foo="yyy" SessionID="kkk" MoreStuff="zz"> Total time for which application threads were stopped: 0.0003858 seconds, Stopping threads took: 0.0000653 seconds
2021-10-07 10:31:32,902 ERROR [LAWT6] blah.blah.blah - Message processing FAILED: <NewOrderSingle SessionID="zkx" TargetSubID="ttt" Account="blah" MsgType="D" BookingTypeOverride="0" Symbol="6316" OtherField1="othervalue1" Otherfield2="othervalue2"/></D></NewOrderSingle>
I want to grab just two key fields: "SessionID" and "MsgType" and print like this:
SessionID="kkk"|
SessionID="zkx"|MsgType="D"
In other words: if a group does not match, I just want to print an empty string in its place.
I've tried the following approach but no luck:
$ perl -ne '/ (SessionID=".*?")? .*(MsgType=".*?")? / and print "$1|$2\n"' sample.txt
SessionID="kkk"|
SessionID="zkx"|
Can somebody enlighten me here? Thank you a lot.
You can use
perl -ne '/\h(SessionID="[^"]*")?(?:\h++.*(MsgType="[^"]*"))?\h/ and print "$1|$2\n"'
See the regex demo. Details:
\h - a horizontal whitespace
(SessionID="[^"]*")? - Group 1: an optional SessionID=", any zero or more chars other than ", and then a "
(?:\h++.*(MsgType="[^"]*"))? - an optional (but greedy) sequence of
  \h++ - one or more horizontal whitespaces (possessive)
  .* - any zero or more chars other than line break chars, as many as possible
  (MsgType="[^"]*") - Group 2: MsgType=", any zero or more chars other than ", and then a "
\h - a horizontal whitespace.
See the online demo:
s='2021-10-07 10:32:05,767 ERROR [LAWT2] blah.blah.blah - Message processing FAILED: <ExecutionReport blah="xxx" foo="yyy" SessionID="kkk" MoreStuff="zz"> Total time for which application threads were stopped: 0.0003858 seconds, Stopping threads took: 0.0000653 seconds
2021-10-07 10:31:32,902 ERROR [LAWT6] blah.blah.blah - Message processing FAILED: <NewOrderSingle SessionID="zkx" TargetSubID="ttt" Account="blah" MsgType="D" BookingTypeOverride="0" Symbol="6316" OtherField1="othervalue1" Otherfield2="othervalue2"/></D></NewOrderSingle>'
perl -ne '/\h(SessionID="[^"]*")?(?:\h++.*(MsgType="[^"]*"))?\h/ and print "$1|$2\n"' <<< "$s"
This prints:
SessionID="kkk"|
SessionID="zkx"|MsgType="D"