splunk, splunk-query

Extract some fields from a part-JSON, part-text log in Splunk


I am fairly new to Splunk and still learning. I have a Splunk event that is a mix of plain text with JSON in between. (This isn't the complete log.)

2021-02-14 00:00:03,596 [[bapm2DQ].bapmprojectFlow.stage1.02] INFO  com.growl.hdt.dmt.DQ.bapm.RetrieveDataFromDQ - Total Application assets -> 1692
2021-02-14 00:00:03,596 [[bapm2DQ].bapmprojectFlow.stage1.02] INFO  com.growl.hdt.dmt.DQ.bapm.CommonUtils - {"Header":{"AppId":"DFG686","Type":"Inbound","RecId":"416c627c-41a7-428e-a871-5317c4842fe5","StartTS":"2021-02-14T05:00Z","Ver":"2.0.0"},"Application":{"APP_OS":"Linux 3.10.0-1160.11.1.el7.x86_64","APP_Runtime":"Java 1.8.0_282","APP_AppName":"DQ-bapm","APP_AppVersion":"1.0.0","Host":"zebra.cdc.growl.com/10.102.180.53","Channel":"Other"},"Service":{"Key":"DQ2bapm","URL":"https://growl-test.DQ.com/rest/2.0/assets?limit=1000&offset=1000&typeId=00000000-0000-0000-0000-000000031302&communityId=595b27d3-ff42-45e4-8dc7-0172f7d82693&domainId=2c8b39ea-0d7f-445f-acc2-a1fb3a9a12db&statusId=00000000-0000-0000-0000-000000005009","CallType":"REST","Operation":"GET"},"Results":{"Elapsed":"0","TraceLevel":"DEBUG"},"Security":{"Vendor":"growl"}}
2021-02-14 00:00:03,795 [[bapm2DQ].bapmprojectFlow.stage1.02] INFO  com.growl.RetrieveDataFromDQ - Total Application assets -> 1692
2021-02-14 00:00:03,795 [[bapm2DQ].bapmprojectFlow.stage1.02] INFO  com.growl.RetrieveDataFromDQ - Total Application assets in appAssetList-> 1692
2021-02-14 00:00:04,499 [[bapm2DQ].bapmprojectFlow.stage1.02] INFO  com.growl.bapm.ComparebapmDQRecords - List of Applications in DQ to be marked "Obsolete in bapm": 
[487684fgfg, hfkeh708089, fgdh678, SDF75664, dffg0007643]
2021-02-14 00:00:04,499 [[bapm2DQ].bapmprojectFlow.stage1.02] INFO  com.growl.ComparebapmDQRecords - ## Total Application count from bapm ##1696
2021-02-14 00:00:04,499 [[bapm2DQ].bapmprojectFlow.stage1.02] INFO  com.growl.hdt.dmt.DQ.bapm.ComparebapmDQRecords - ## Total Application Asset in DQ ##1692
2021-02-14 00:00:04,499 [[bapm2DQ].bapmprojectFlow.stage1.02] INFO  com.growl.ComparebapmDQRecords - ## No of Application to Obsolete in DQ ##5

How can I extract the following?

List of Applications in DQ to be marked "Obsolete in bapm": 
[487684fgfg, hfkeh708089, fgdh678, SDF75664, dffg0007643]
Total Application count from bapm ##1696
Total Application Asset in DQ ##1692
No of Application to Obsolete in DQ ##5

I tried something like the query below, which was suggested for a similar case in another post, but it didn't work:

index=hdt sourcetype=dq2 |  rex field=_raw "(?msi)(?<Total Application count from BAPM>\{.+\}$)"
| spath input="Total Application count from BAPM"

Solution

  • I get the impression, perhaps wrongly, that you believe the rex command will search all events concurrently. That is not the case. Splunk processes each event individually and independently, so the query must be prepared to find any of the target strings in any event.

    | makeresults 
    | eval data="2021-02-14 00:00:03,596 [[bapm2DQ].bapmprojectFlow.stage1.02] INFO  com.growl.hdt.dmt.DQ.bapm.RetrieveDataFromDQ - Total Application assets -> 1692~
    2021-02-14 00:00:03,596 [[bapm2DQ].bapmprojectFlow.stage1.02] INFO  com.growl.hdt.dmt.DQ.bapm.CommonUtils - {\"Header\":{\"AppId\":\"AD00006933\",\"Type\":\"Inbound\",\"RecId\":\"416c627c-41a7-428e-a871-5317c4842fe5\",\"StartTS\":\"2021-02-14T05:00Z\",\"Ver\":\"2.0.0\"},\"Application\":{\"APP_OS\":\"Linux 3.10.0-1160.11.1.el7.x86_64\",\"APP_Runtime\":\"Java 1.8.0_282\",\"APP_AppName\":\"DQ-bapm-Integration\",\"APP_AppVersion\":\"1.0.0\",\"Host\":\"zebra.cdc.growl.com/10.102.180.53\",\"Channel\":\"Other\"},\"Service\":{\"Key\":\"DQ2bapm\",\"URL\":\"https://growl-test.DQ.com/rest/2.0/assets?limit=1000&offset=1000&typeId=00000000-0000-0000-0000-000000031302&communityId=595b27d3-ff42-45e4-8dc7-0172f7d82693&domainId=2c8b39ea-0d7f-445f-acc2-a1fb3a9a12db&statusId=00000000-0000-0000-0000-000000005009\",\"CallType\":\"REST\",\"Operation\":\"GET\"},\"Results\":{\"Elapsed\":\"0\",\"Message\":\"Invoking DQ REST API\",\"TraceLevel\":\"DEBUG\"},\"Security\":{\"Vendor\":\"growl\"}}~
    2021-02-14 00:00:03,795 [[bapm2DQ].bapmprojectFlow.stage1.02] INFO  com.growl.hdt.dmt.DQ.bapm.RetrieveDataFromDQ - Total Application assets -> 1692~
    2021-02-14 00:00:03,795 [[bapm2DQ].bapmprojectFlow.stage1.02] INFO  com.growl.hdt.dmt.DQ.bapm.RetrieveDataFromDQ - Total Application assets in appAssetList-> 1692~
    2021-02-14 00:00:04,499 [[bapm2DQ].bapmprojectFlow.stage1.02] INFO  com.growl.hdt.dmt.DQ.bapm.ComparebapmDQRecords - List of Applications in DQ to be marked \"Obsolete in bapm\": 
    [AD00007661, AD00007470, AD00007539, AD00007549, AD00007643]~
    2021-02-14 00:00:04,499 [[bapm2DQ].bapmprojectFlow.stage1.02] INFO  com.growl.hdt.dmt.DQ.bapm.ComparebapmDQRecords - ## Total Application count from bapm ##1696~
    2021-02-14 00:00:04,499 [[bapm2DQ].bapmprojectFlow.stage1.02] INFO  com.growl.hdt.dmt.DQ.bapm.ComparebapmDQRecords - ## Total Application Asset in DQ ##1692~
    2021-02-14 00:00:04,499 [[bapm2DQ].bapmprojectFlow.stage1.02] INFO  com.growl.hdt.dmt.DQ.bapm.ComparebapmDQRecords - ## No of Application to Obsolete in DQ ##5"
    | eval data=split(data,"~")
    | mvexpand data
    | eval _raw=data
    | fields - data
    ```Above just sets up test data```
    ```Use rex to extract fields```
    | rex "List of Applications in DQ to be marked \\\"Obsolete in bapm\\\":\s*(?<Obsolete_in_bapm>.*)"
    | rex "Total Application count from bapm ##(?<ApplicationCount>\d+)"
    | rex "Total Application Asset in DQ ##(?<Asset_in_DQ>\d+)"
    | rex "No of Application to Obsolete in DQ ##(?<Obsolete_in_DQ>\d+)"
    ```The stats command groups the extracted fields into a single event```
    | stats values(*) as *
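
As a rough sketch of how the same extractions might look against the real data rather than the makeresults test events: the index and sourcetype are taken from the question, and the search assumes each log line is indexed as its own event, with the bracketed list kept in the same event as its "List of Applications" line. The quoted phrases in the base search are only an assumed pre-filter to cut down the events that rex has to examine. The stats fields are named explicitly because `values(*)` on real events would also collect every other extracted field.

```spl
index=hdt sourcetype=dq2
    ("Obsolete in bapm" OR "Total Application count from bapm" OR "Total Application Asset in DQ" OR "No of Application to Obsolete in DQ")
| rex "List of Applications in DQ to be marked \\\"Obsolete in bapm\\\":\s*(?<Obsolete_in_bapm>.*)"
| rex "Total Application count from bapm ##(?<ApplicationCount>\d+)"
| rex "Total Application Asset in DQ ##(?<Asset_in_DQ>\d+)"
| rex "No of Application to Obsolete in DQ ##(?<Obsolete_in_DQ>\d+)"
| stats values(Obsolete_in_bapm) as Obsolete_in_bapm
        values(ApplicationCount) as ApplicationCount
        values(Asset_in_DQ) as Asset_in_DQ
        values(Obsolete_in_DQ) as Obsolete_in_DQ
```

Because `values()` merges and de-duplicates across every matching event, a time range that covers several runs of the job will combine their results. Narrowing the time range to a single run is one way to avoid that. Note also that the capture-group name in the attempted query, `Total Application count from BAPM`, contains spaces, which regex group names do not allow. That is why the extracted field names here use underscores.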