I have some logs in Splunk for which I'm trying to extract a few values. My log entries look like this:
host-03.company.local:9011[read 3617, write 120 bytes] host-05.company.local:9011[read 370658827, write 177471 bytes] host-07.company.local:9011[read 99, write 96 bytes] host-07.company.local:9011[read 96, write 96 bytes] host-05.company.local:9011[read 120, write 120 bytes] host-05.company.local:9011[read 120, write 120 bytes] host-03.company.local:9015[read 42955, write 120 bytes] host-05.company.local:9015[read 3048879, write 86677386 bytes] host-02.company.local:7035[read 120, write 120 bytes] host-03.company.local:9015[read 120, write 120 bytes] host-05.company.local:9015[read 809077, write 120 bytes] host-02.company.local:7035[read 120, write 120 bytes] host-03.company.local:9015[read 120, write 120 bytes] host-05.company.local:9015[read 120, write 120 bytes] host-02.company.local:7035[read 120, write 120 bytes]
The pattern these log entries follow is host:port[read xxx, write yyy bytes]
There can be anywhere from 1 to about 20 host records in this log line.
What I'm hoping to do, in Splunk, is extract these fields to a table, such that the result looks like:
hostname readBytes WriteBytes
-----------------------------------------------
host-03.company.local:9011 3617 120
host-05.company.local:9011 370658827 177471
host-07.company.local:9011 99 96
host-05.company.local:9011 120 120
The logic here is that I'm extracting the read and write entries for each host, so that each one becomes a row in this table.
I've made some progress in extracting the hosts, with the rex:
index=myApplication <mySearch>
| rex field=_raw "(?<hostsTmp>([a-zA-Z0-9\-\.]+:[0-9]+))"
| table hostsTmp
However, even this result seems wrong: some of the results are just blank lines. In addition, the hostsTmp field doesn't seem to be a multivalue field; mvcount(hostsTmp) returns nothing for each entry.
mvcount(hostsTmp) len(hostsTmp) hostsTmp
--------------------------------------------
- - host-05.company.local:9011
- - -
- - host-05.company.local:9011
- - -
- - host-05.company.local:9011
- - -
Note that I'm using the - character here to represent a lack of data in my table. Every other line is completely blank, and the mvcount and len values for hostsTmp are always empty.
Relatively new to Splunk and not an expert in regex, so any help is appreciated.
What I suggest is to split each host's record into a separate event and then run a rex on each event.
The split below turns the full event into a multivalue field (named ev) with one value per host record, and mvexpand then expands each value into its own event.
| eval ev=split(raw,"]")
| mvexpand ev
Then, a simple rex can be used to extract the data.
| rex field=ev "^\s*(?<hostname>[^\[]+)\[read\s+(?<readBytes>\d+),\s+write\s+(?<writeBytes>\d+)\s+bytes"
And use table to format it appropriately.
| table hostname readBytes writeBytes
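One caveat, offered as an assumption rather than something verified here: because each log line ends with "]", split may leave an empty trailing value, which mvexpand turns into an event the rex cannot match, giving a blank row. If that happens, a filter placed after the rex and before the table should drop those rows:

```
| rex field=ev "^\s*(?<hostname>[^\[]+)\[read\s+(?<readBytes>\d+),\s+write\s+(?<writeBytes>\d+)\s+bytes"
| where isnotnull(hostname)
| table hostname readBytes writeBytes
```

The filter is harmless when no empty value is produced, since every real host record yields a non-null hostname.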
Here is an example showing it working. You will probably need to change the raw argument of split to point to the field in your own events, or use _raw.
| makeresults | eval raw="host-03.company.local:9011[read 3617, write 120 bytes] host-05.company.local:9011[read 370658827, write 177471 bytes] host-07.company.local:9011[read 99, write 96 bytes] host-07.company.local:9011[read 96, write 96 bytes] host-05.company.local:9011[read 120, write 120 bytes] host-05.company.local:9011[read 120, write 120 bytes] host-03.company.local:9015[read 42955, write 120 bytes] host-05.company.local:9015[read 3048879, write 86677386 bytes] host-02.company.local:7035[read 120, write 120 bytes] host-03.company.local:9015[read 120, write 120 bytes] host-05.company.local:9015[read 809077, write 120 bytes] host-02.company.local:7035[read 120, write 120 bytes] host-03.company.local:9015[read 120, write 120 bytes] host-05.company.local:9015[read 120, write 120 bytes] host-02.company.local:7035[read 120, write 120 bytes]"
| eval ev=split(raw,"]")
| mvexpand ev
| rex field=ev "^\s*(?<hostname>[^\[]+)\[read\s+(?<readBytes>\d+),\s+write\s+(?<writeBytes>\d+)\s+bytes"
| table hostname readBytes writeBytes
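As an alternative to splitting, rex can capture every match itself. By default rex keeps only the first match (max_match=1), which would explain why hostsTmp in the question only ever held a single host. The sketch below assumes the events live in _raw and that the "|" character never appears in a host name; the field names row and parts are placeholders of mine, not anything from the original search:

```
index=myApplication <mySearch>
| rex field=_raw max_match=0 "(?<hostname>[a-zA-Z0-9\-\.]+:\d+)\[read\s+(?<readBytes>\d+),\s+write\s+(?<writeBytes>\d+)\s+bytes\]"
| eval row=mvzip(mvzip(hostname, readBytes, "|"), writeBytes, "|")
| mvexpand row
| eval parts=split(row, "|")
| eval hostname=mvindex(parts, 0), readBytes=mvindex(parts, 1), writeBytes=mvindex(parts, 2)
| table hostname readBytes writeBytes
```

With max_match=0 the three captures become parallel multivalue fields. mvzip stitches them together position by position so that mvexpand keeps each host paired with its own read and write counts, and the final eval splits each stitched value back into single-value fields.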