I have this file of 10,000+ lines of messages from a game server, like so:
11.07.23 08:40:16 [INFO] NC: Moving violation: wolfman98 from yasmp (-90.8, 64.0, 167.5) to (-90.7, 64.0, 167.3) distance (0.0, 0.0, 0.2)
11.07.23 10:57:44 [INFO] NC: Moving violation: AKxiZeroDark from yasmp (-1228.3, 11.2, 1098.7) to (-1228.3, 11.2, 1098.7) distance (0.0, 0.0, 0.0)
The regex I have so far is \d{1,4}\.\d{1}, which matches every decimal number in the line, including parts of the timestamp at the start:
11.07.23 08:40:16 [INFO] NC: Moving violation: wolfman98 from yasmp (-90.8, 64.0, 167.5) to (-90.7, 64.0, 167.3) distance (0.0, 0.0, 0.2)
I've been having trouble finding a way to match only this part:
(-1228.3, 11.2, 1098.7) to (-1228.3, 11.2, 1098.7)
that is, the coordinates before the word "distance", without also matching the timestamp at the beginning, and then replacing it so the lines end up like this:
11.07.23 08:40:16 [INFO] NC: Moving violation: wolfman98 from yasmp (-#, #, #) to (-#, #, #) distance (0.0, 0.0, 0.2)
11.07.23 10:57:44 [INFO] NC: Moving violation: AKxiZeroDark from yasmp (-#, #, #) to (-#, #, #) distance (0.0, 0.0, 0.0)
Some extra information: the numbers may or may not be negative, and they range from one digit before the decimal point (1.0) to four (1234.0), which is why I need help matching only what comes before the word "distance".
EDIT: It would also be fine if the coordinates were removed entirely:
11.07.23 08:40:16 [INFO] NC: Moving violation: wolfman98 from yasmp distance (0.0, 0.0, 0.2)
11.07.23 10:57:44 [INFO] NC: Moving violation: AKxiZeroDark from yasmp distance (0.0, 0.0, 0.0)
A fairly hairy-looking regex that extends your number-matching regex would be \((?:-?\d{1,4}\.\d{1}(?:, |\))){3} to \((?:-?\d{1,4}\.\d{1}(?:, |\))){3}(?= distance). Let's break that down a little.
It is made up of two identical groups, one for each parenthesized set of numbers: \((?:-?\d{1,4}\.\d{1}(?:, |\))){3}. The regex now allows an optional - before the number, which makes the number match -?\d{1,4}\.\d{1}. Each number is followed by either a comma and a space or a closing paren, so to repeat the number match we need that as well: (?:, |\)). That entire beast is then prefixed with \( to match the opening paren of the number group. The group regex appears twice, once for each set of numbers, with the literal " to " matched in between.
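To make the breakdown concrete, here is a minimal sketch that assembles the same regex from its pieces. It uses Python's re module rather than Perl purely for illustration, and the names NUMBER, GROUP and PATTERN are made up for this example.

```python
import re

# One coordinate: optional minus sign, 1-4 digits, a dot, one digit.
NUMBER = r"-?\d{1,4}\.\d{1}"
# One parenthesized triple: "(" then three numbers, each followed by ", " or ")".
GROUP = r"\((?:" + NUMBER + r"(?:, |\))){3}"
# Two triples joined by " to ", followed (but not consumed) by " distance".
PATTERN = re.compile(GROUP + " to " + GROUP + r"(?= distance)")

line = ("11.07.23 10:57:44 [INFO] NC: Moving violation: AKxiZeroDark from yasmp "
        "(-1228.3, 11.2, 1098.7) to (-1228.3, 11.2, 1098.7) distance (0.0, 0.0, 0.0)")

match = PATTERN.search(line)
print(match.group(0))  # (-1228.3, 11.2, 1098.7) to (-1228.3, 11.2, 1098.7)
```

The timestamp is never matched because the pattern has to start at an opening paren, and the trailing distance triple is skipped because it is not followed by " to ".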
The final bit is a positive look-ahead, (?= distance), which ensures that we only match number groups that are followed by the word distance. That word is not included in the match, but it has to be there for the regex to match.
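The effect of the look-ahead can be checked with a short sketch, again in Python for illustration. The second sample string, which ends in "elsewhere", is invented for the example to show a non-match.

```python
import re

PATTERN = re.compile(
    r"\((?:-?\d{1,4}\.\d{1}(?:, |\))){3} to "
    r"\((?:-?\d{1,4}\.\d{1}(?:, |\))){3}(?= distance)"
)

with_distance = "(-90.8, 64.0, 167.5) to (-90.7, 64.0, 167.3) distance (0.0, 0.0, 0.2)"
without_distance = "(-90.8, 64.0, 167.5) to (-90.7, 64.0, 167.3) elsewhere"

m = PATTERN.search(with_distance)
# The match stops before " distance": the look-ahead checks it without consuming it.
rest = with_distance[m.end():]
print(repr(rest))  # ' distance (0.0, 0.0, 0.2)'

# Without " distance" right after the second triple, nothing matches.
print(PATTERN.search(without_distance))  # None
```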
I've used non-capturing groups (the (?: ... ) stuff) because I don't know what you want to do with the captures.
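If you do want the two coordinate triples as captures, one option is to wrap each group in plain parentheses. The sketch below (Python, with GROUP as an illustrative name) compares the two variants.

```python
import re

GROUP = r"\((?:-?\d{1,4}\.\d{1}(?:, |\))){3}"
text = "(-90.8, 64.0, 167.5) to (-90.7, 64.0, 167.3) distance (0.0, 0.0, 0.2)"

# Non-capturing only: the match succeeds but exposes no sub-groups.
non_capturing = re.search(GROUP + " to " + GROUP + r"(?= distance)", text)
print(non_capturing.groups())  # ()

# Wrapping each triple in ( ... ) makes them available as groups 1 and 2.
capturing = re.search("(" + GROUP + ") to (" + GROUP + r")(?= distance)", text)
print(capturing.groups())  # ('(-90.8, 64.0, 167.5)', '(-90.7, 64.0, 167.3)')
```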
I've tried this out against your two example log lines using Perl 5.12.2, and it seems to work.
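For the replacement itself, here is one possible sketch in Python (not the Perl I tested with). The helper names mask_coords and drop_coords are hypothetical. The first keeps the signs and punctuation and swaps each number for #; the second removes the coordinates altogether, as in your EDIT.

```python
import re

COORDS = re.compile(
    r"\((?:-?\d{1,4}\.\d{1}(?:, |\))){3} to "
    r"\((?:-?\d{1,4}\.\d{1}(?:, |\))){3}(?= distance)"
)
NUMBER = re.compile(r"\d{1,4}\.\d{1}")

def mask_coords(line):
    # Inside the matched span only, replace every number with "#".
    return COORDS.sub(lambda m: NUMBER.sub("#", m.group(0)), line)

def drop_coords(line):
    # Remove the span together with the single space in front of it.
    return re.sub(" " + COORDS.pattern, "", line)

line = ("11.07.23 08:40:16 [INFO] NC: Moving violation: wolfman98 from yasmp "
        "(-90.8, 64.0, 167.5) to (-90.7, 64.0, 167.3) distance (0.0, 0.0, 0.2)")

print(mask_coords(line))
print(drop_coords(line))
```

Because the number substitution runs only on the text that COORDS matched, the timestamp and the distance triple are left alone.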