I was wondering if someone could help me with a parsing problem. I've been working on parsing a particular log using named capture groups (Description, FooBar, etc.), and it has been a big challenge.
The log file looks like this:
2021-02-10T09:0022.041-05:00 | Info | TransactionGUID=yyyy1234-12a1-1a99-1234-01ab1ab12abc | TransactionID=123456 | Saving uploaded file to shared folder \\foobar\foo\fil\ENV1\ABMylocingZone\TIMS\FileTemplates\12345678_12345678_01ab1ab12abc-99f5-4a43-9127-01ab1ab12abc.xlsx | CopyToSharedFolder()
I need to place this set of text:
Saving uploaded file to shared folder \\foobar\foo\fil\ENV1\ABMylocingZone\TIMS\FileTemplates\12345678_12345678_01ab1ab12abc-99f5-4a43-9127-01ab1ab12abc.xlsx | CopyToSharedFolder()
into the "Description" capture group.
I need to place this set of text:
12345678
in the "FooBar" capture group.
Below is what I have come up with so far. If I try to add the FooBar capture group (omitted from the rule below), I lose part of the Description capture group. Because of the application I'm working with, I need to use the Grok Debugger to create and debug my rule:
[A-Za-z0-9]{0,7}%{SPACE}%{TIMESTAMP_ISO8601:dateTime}%{SPACE}\|%{SPACE}%{LOGLEVEL:Loglevel}%{SPACE}\|%{SPACE}TransactionGUID=%{UUID:GUID}%{SPACE}\|%{SPACE}TransactionID=%{NUMBER:TransactionId}%{SPACE}\|%{SPACE}(?<Description>(?<=\|\s).*(?=\)?))
Short version:
This message...
MyGroup12345679ContainsInfo
... is captured by the message group, and the number it contains is captured by the nested hidden_message group:
(?<message>[a-zA-Z]+(?<hidden_message>%{NUMBER})[a-zA-Z]+)
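To try the nesting idea outside the Grok Debugger, here is a minimal Python sketch. It rests on two assumptions: Python's re module spells named groups as (?P<name>...) rather than (?<name>...), and [0-9]+ is used as a simplified stand-in for %{NUMBER}, which accepts more number formats than that.

```python
import re

# Python writes named groups as (?P<name>...); grok uses (?<name>...).
# [0-9]+ is a simplified stand-in for grok's %{NUMBER} (an assumption).
pattern = re.compile(
    r"(?P<message>[a-zA-Z]+(?P<hidden_message>[0-9]+)[a-zA-Z]+)"
)

match = pattern.search("MyGroup12345679ContainsInfo")
captures = match.groupdict()

print(captures["message"])         # MyGroup12345679ContainsInfo
print(captures["hidden_message"])  # 12345679
```

The inner group does not take anything away from the outer one: both captures come from the same match, and the outer group still spans the whole text.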
Complete version:
For your exact log, I would parse it this way (I had to replace UUID with NUMBER for testing purposes):
grok {
  # Add the pattern you are using now as a second array entry,
  # unless there is always a path to match there.
  match => {
    "message" => [
      "%{TIMESTAMP_ISO8601:dateTime}%{SPACE}\|%{SPACE}%{LOGLEVEL:Loglevel}%{SPACE}\|%{SPACE}TransactionGUID=%{NUMBER:GUID}%{SPACE}\|%{SPACE}TransactionID=%{NUMBER:TransactionId}%{SPACE}\|%{SPACE}(?<Description>.*(\\(?<FooBar>[0-9]+)_[^\\]+\.[a-zA-Z0-9]+).*)"
    ]
  }
}
Tested log:
2021-02-10T09:0022.041-05:00 | Info | TransactionGUID=82147 | TransactionID=123456 | Saving uploaded file to shared folder \\foobar\foo\fil\ENV1\ABMylocingZone\TIMS\FileTemplates\12345678_12345678_01ab1ab12abc-99f5-4a43-9127-01ab1ab12abc.xlsx | CopyToSharedFolder()
The Description part explained:
.* # greedily consumes characters
( # matches a filename beginning with a number :
\\(?<FooBar>[0-9]+) # one "\", a number,
_[^\\]+ # one _, anything but a "\" once or more
\.[a-zA-Z0-9]+ # a file extension
)
.* # the rest of it
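
The breakdown above can be checked with a short Python sketch. This is only an approximation of what grok does: it covers just the Description sub-pattern (not the timestamp, log level, or transaction fields), it uses Python's (?P<name>...) spelling for named groups, and the second sample line (no_path_text) is a hypothetical message made up to show what happens when no file path is present.

```python
import re

# Verbose-mode translation of the Description sub-pattern only.
# Python writes named groups as (?P<name>...) instead of (?<name>...).
DESCRIPTION = re.compile(
    r"""
    (?P<Description>
        .*                        # greedily consume the leading text
        (                         # a file name that begins with a number:
            \\(?P<FooBar>[0-9]+)  #   one backslash, then the number
            _[^\\]+               #   an underscore, then anything but a backslash
            \.[a-zA-Z0-9]+        #   a file extension
        )
        .*                        # the rest of the line
    )
    """,
    re.VERBOSE,
)

# The description portion of the tested log line (raw strings keep the backslashes).
description_text = (
    r"Saving uploaded file to shared folder "
    r"\\foobar\foo\fil\ENV1\ABMylocingZone\TIMS\FileTemplates"
    r"\12345678_12345678_01ab1ab12abc-99f5-4a43-9127-01ab1ab12abc.xlsx"
    r" | CopyToSharedFolder()"
)

match = DESCRIPTION.search(description_text)
print(match.group("FooBar"))                           # 12345678
print(match.group("Description") == description_text)  # True

# Hypothetical message without a file path: the pattern does not match at all.
no_path_text = "Upload finished | Done()"
print(DESCRIPTION.search(no_path_text))                # None
```

The last check is the reason for keeping a second, more general pattern as a fallback: a description that contains no path would otherwise fail to parse.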