I'm working with regex in a PCRE2 environment.
In my switch logs I have to capture a text string, which I call "message",
that is located in a specific position. The key point is that it is always preceded by a set of characters ending with :
but, after them, there may or may not be some additional characters ending with ;
and I must be able to skip them.
Let me explain with my current regex and some log samples.
There are 3 possible cases:
1. (s)[18014]:Recorded command information.
2. (l):User logged out.
3. (s)[18014]:CID=0x11aa2222;The user succeeded in logging out of XXX.
My current regex is:
\(\w+\)\[*\d*\]*\:(?<message>[^\[]+?\.)
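This works for cases 1 and 2 because:
\(\w+\) matches the parenthesized tag, such as (s) or (l);
\[*\d*\]* matches the optional bracketed number, such as [18014];
\: matches the : that follows;
(?<message>[^\[]+?\.) is the capture group, which must not capture when a [ follows the : and which stops at the first . it finds.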
My problem is case 3: after the : the text always begins with CID=<hexadecimal expression>;
but it is not limited to this. After it, there can be other expressions, always ended by ;
So for case 3 I can have CID=<hex expression><other numeric and literal characters>;
With the current regex, of course, the CID part is included in the message. I must avoid that: if the CID part is present, the message capture must start after the ; that ends it.
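To make the problem concrete, here is a minimal sketch that runs the current regex against the three samples. It uses Python's re module rather than PCRE2, which is an assumption made only for illustration; the constructs involved should behave the same way in both. The named group is spelled (?P<message>...), which Python requires and PCRE2 also accepts, and the helper name current_message is hypothetical.

```python
import re

# The current pattern from the question. Only the named-group spelling is
# changed: Python needs (?P<message>...) instead of (?<message>...).
CURRENT = re.compile(r"\(\w+\)\[*\d*\]*:(?P<message>[^\[]+?\.)")

# The three sample log fragments from the question.
SAMPLES = [
    "(s)[18014]:Recorded command information.",
    "(l):User logged out.",
    "(s)[18014]:CID=0x11aa2222;The user succeeded in logging out of XXX.",
]


def current_message(line):
    """Return the 'message' group of the current regex, or None if no match."""
    m = CURRENT.search(line)
    return m.group("message") if m else None


if __name__ == "__main__":
    for line in SAMPLES:
        # Cases 1 and 2 print the bare message; case 3 still carries "CID=...;".
        print(repr(current_message(line)))
```

For the third sample the captured text starts with CID=0x11aa2222; which is exactly the part that has to be skipped.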
So, to summarize:
IF there is no CID word after the : start capturing; ELSE, skip everything up to the ; and start capturing after it.
The following pattern will match the right part of your test strings.
We look for either a : that is not followed by CID, using the negative lookahead (?!CID), or a ; and then capture what follows. The message ends up in the last capture group (group 3).
((:(?!CID))|;)(.*)
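As a quick check, the pattern can be tried against the three samples from the question. This is only a sketch: it uses Python's re module instead of PCRE2 (an assumption for illustration; the negative lookahead and alternation used here work the same way in both), and the function name extract_message is hypothetical.

```python
import re

# The pattern from the answer, unchanged. Group 1 is the separator
# (a ':' not followed by "CID", or a ';'), group 3 is the text after it.
ANSWER = re.compile(r"((:(?!CID))|;)(.*)")


def extract_message(line):
    """Return the text after the first valid separator, or None if none is found."""
    m = ANSWER.search(line)
    return m.group(3) if m else None


if __name__ == "__main__":
    samples = [
        "(s)[18014]:Recorded command information.",
        "(l):User logged out.",
        "(s)[18014]:CID=0x11aa2222;The user succeeded in logging out of XXX.",
    ]
    for line in samples:
        print(extract_message(line))
```

Note that the match starts at the first separator found, so a message that itself contains an earlier : or ; in the prefix would need a more anchored pattern.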