I need help returning all the data in a log file that lies between two specific delimiters. Our logs usually look like this:
2018-04-17 03:59:29,243 TRACE [xml] This is just a test.
2018-04-17 13:22:24,230 INFO [properties] I believe this is another test.
2018-04-18 03:48:07,043 ERROR [properties] (Thread-13) UpdateType: more data coming here; ProcessId: 5010
2018-04-17 13:22:24,230 INFO [log] I need to retrieve this string here
and also this one as it is part of the same text
2018-04-17 13:22:24,230 INFO [det] I believe this is another test.
If I grep "here" I get only the line containing the word, but I need to retrieve the whole entry, including the lines that follow it. The line breaks are probably contributing to my problem. What I need is:
2018-04-17 13:22:24,230 INFO [log] I need to retrieve this string here
and also this one as it is part of the same text
There can be several occurrences of "here" in the log file. I tried to do this with sed, but I can't find the right way to specify the delimiters, which I think should be the whole date.
I really appreciate your help on this.
New example after Karakfa's comments:
2018-04-17 03:48:07,044 INFO [passpoint-logger] (Thread-19) ERFG|1.0||ID:414d512049584450414153541541871985165165130312020203aa4b|Thread-19|||2018-04-17 03:48:07|out-1||out-1|
2018-04-17 03:59:29,243 TRACE [xml] (Thread-19) RAW MED XML: <?xml version="1.0" encoding="UTF-8" standalone="yes"?><MED:MED_PMT_Tmp_Notif xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://services.xxx.com/POQ/v01" xmlns:POQ="http://services.xxx.com/POQ/v01" xmlns:MED="http://services.xxx.com/MED/v1.2" version="1.2.3" messageID="15290140135778972043" Updat584ype="PGML" xsi:schemaLocation="http://services.xxx.com/MED/v1.2 MED_PMT_v.1.2.3.xsd">
<MED_Space xmlns:ns2="http://services.xxx.com/MED/v1.2" xmlns:ns4="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ns3="http://services.xxx.com/POQ_Header/v01" status="AVAIL" dest="MQX" aircraftType="DH8" aircraftConfig="120">
<Space_ID partition="584" orig="ADD3" messageCreate="2018-04-17T03:59:29.202-05:00">
<Space carrier="584" date="2018-04-18">0108</Space>
</Space_ID>
<DepartAndArrive estDep="2018-04-18T18:10:00+03:00" schedDep="2018-04-18T18:10:00+03:00" estArrival="2018-04-18T19:30:00+03:00" schedArrival="2018-04-18T19:30:00+03:00"/>
<Sched_OandD orig="ADD3" dest="MQX"/>
</MED_Space>
<TRX_Record xmlns:ns2="http://services.xxx.com/MED/v1.2" xmlns:ns4="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ns3="http://services.xxx.com/POQ_Header/v01">
<TRX_ID FILCreate="2018-04-17T03:59:00-05:00" resID="1">TFRSVL</TRX_ID>
<Space>
<Inds revenue="1"/>
<Identification nameID="1" dHS_ID="TFRSVL001" gender="X">
<Name_First>SMITH MR</Name_First>
<Name_Last>P584ER</Name_Last>
<TT tier="0"/>
</Identification>
<TRXType>F</TRXType>
<SRiuyx>0</SRiuyx>
<GroupRes>1</GroupRes>
<SystemInstances inventory="H">Y</SystemInstances>
<OandD_FIL orig="ADD3" dest="MQX"/>
<Store="584">0108</Store>
<CodingSpec="584">0108</CodingSpec>
</Space>
</TRX_Record>
<ns2:TRX_Count xmlns:ns2="http://services.xxx.com/MED/v1.2" xmlns:ns4="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ns3="http://services.xxx.com/POQ_Header/v01">1</ns2:TRX_Count>
<ns2:Transaction_D584ails xmlns:ns2="http://services.xxx.com/MED/v1.2" xmlns:ns4="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ns3="http://services.xxx.com/POQ_Header/v01" sourceID="TPF">
<Client_Entry_Info authRSX="54" agx="S4" code="ADD3">RESTORE AMEND:NEW-FIL/AFAX-UPDATED</Client_Entry_Info>
</ns2:Transaction_D584ails>
</MED:MED_PMT_Tmp_Notif>
2018-04-17 03:59:29,244 INFO [properties] (Thread-19) Updat584ype: PGML ; ProcessId: ##MISSING##
The command below does not return the whole text for this example:
awk -v RS='(^|\n)[0-9 :,-]+' '/TFRSVL/{print rs,$0} {rs=RT}' file
With GNU awk, which accepts a multi-character (regex) record separator and stores the text that matched it in RT:
$ awk -v RS='(^|\n)[0-9 :,-]+' '/here/{print rs,$0} {rs=RT}' file
2018-04-18 03:48:07,043 ERROR [properties] (Thread-13) UpdateType: more data coming here; ProcessId: 5010
2018-04-17 13:22:24,230 INFO [log] I need to retrieve this string here
and also this one as it is part of the same text
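NB: Here I cheated by building the record separator from the set of characters that appear in the timestamp (digits, spaces, colons, commas, hyphens) rather than from its exact layout. You can spell the pattern out exactly to eliminate false positives, where a continuation line that happens to start with one of those characters is taken as the start of a new record. Or perhaps add the log levels (INFO, ERROR, ...) to the match as well.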
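If GNU awk is not available, or the loose separator splits records in the wrong place, one alternative is to buffer lines and start a new record only when a line begins with a full timestamp. This is a different technique from the RS/RT approach above, offered as a minimal sketch: the function name `extract_records` is made up for illustration, and it assumes every record starts with `YYYY-MM-DD HH:MM:SS,mmm` followed by a space, as in the samples in the question.

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool): print every log record,
# i.e. a timestamp line plus its continuation lines, that matches $1.
# Usage: extract_records PATTERN FILE
extract_records() {
    awk -v pat="$1" '
        # Print the buffered record if it matches, then reset the buffer.
        function emit() { if (rec != "" && rec ~ pat) print rec; rec = "" }
        # Assumed layout: a line opening with YYYY-MM-DD HH:MM:SS,mmm
        # and a space starts a new record.
        /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9],[0-9][0-9][0-9] / {
            emit(); rec = $0; next
        }
        # Any other line continues the current record.
        { rec = (rec == "" ? $0 : rec "\n" $0) }
        END { emit() }
    ' "$2"
}

# Sample input copied from the first log excerpt in the question.
log=$(mktemp)
cat > "$log" <<'EOF'
2018-04-17 03:59:29,243 TRACE [xml] This is just a test.
2018-04-17 13:22:24,230 INFO [properties] I believe this is another test.
2018-04-18 03:48:07,043 ERROR [properties] (Thread-13) UpdateType: more data coming here; ProcessId: 5010
2018-04-17 13:22:24,230 INFO [log] I need to retrieve this string here
and also this one as it is part of the same text
2018-04-17 13:22:24,230 INFO [det] I believe this is another test.
EOF

extract_records 'here' "$log"
```

Because the record start is tied to the exact timestamp layout, lines inside a multi-line XML payload are kept with their record unless one of them happens to begin with a complete timestamp. The pattern is treated as an awk regular expression and is matched against the whole record, not line by line.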