Tags: c++, regex, boost-regex

Join two regular expression results into one output field, when only one is possible at a time


I'm parsing chat output to extract the username. The lines may look like this:

<Darker> MC_BOT sethome
(From Darker): MC_BOT exit

I need to match the username and the command in each line. Extracting both from either format is easy:

^(<([A-Za-z0-9_]+)>|\\(From ([A-Za-z0-9_]+)\\):) MC_BOT ([a-z]+)( [a-zA-Z0-9 ]+)?$
  |<Darker>         |(From Darker):                     |sethome

The problem is that for <Darker> the username ends up in field 2, but for (From Darker): it ends up in field 3.

<Darker> MC_BOT command parameters
   1: <Darker>
   2: Darker  - field 2!
   3: 
   4: command
   5:  parameters


(From Darker): MC_BOT command parameters
   1: (From Darker):
   2: 
   3: Darker  - field 3!
   4: command
   5:  parameters

So how should I write this regexp so that the username always lands in the same field? Also, can I keep the (...|...) group itself from being captured? I only need the username, not <username> or (From username):.


Solution

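  • Boost.Regex supports the branch reset group (?|...) in its Perl syntax: capture numbering restarts at each alternative, so both usernames land in group 1, and the (?|...) wrapper itself is not captured. So you could use something like: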

    ^(?|<([A-Za-z0-9_]+)>|\(From ([A-Za-z0-9_]+)\):) MC_BOT ([a-z]+)( [a-zA-Z0-9 ]+)?$
     ^   ^                       ^                          ^       ^
     |    \ group 1               \ also group 1             \ g.2   \ group 3
     |
     \ branch reset
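
If Boost is not an option, here is a rough sketch of a workaround with std::regex. Note that this is a swapped-in technique, not branch reset: the ECMAScript grammar used by std::regex has no (?|...), so the sketch makes the wrapper non-capturing with (?:...) and picks whichever of the two username groups actually matched. The names ChatCommand, parse_chat_line and check_examples are made up for illustration and assume C++17.

```cpp
#include <cassert>
#include <optional>
#include <regex>
#include <string>

// Hypothetical result type for this sketch (not from the original post).
struct ChatCommand {
    std::string user;
    std::string command;
    std::string params;  // keeps the leading space, as in the original pattern
};

// Hypothetical helper: parse one chat line, or return nullopt on no match.
std::optional<ChatCommand> parse_chat_line(const std::string& line) {
    // (?:...) groups the two alternatives without capturing the wrapper,
    // so "<Darker>" / "(From Darker):" no longer takes a field of its own.
    // Group 1 = username in <...>, group 2 = username in (From ...):,
    // group 3 = command, group 4 = optional parameters.
    static const std::regex re(
        R"(^(?:<([A-Za-z0-9_]+)>|\(From ([A-Za-z0-9_]+)\):) MC_BOT ([a-z]+)( [a-zA-Z0-9 ]+)?$)");

    std::smatch m;
    if (!std::regex_match(line, m, re)) {
        return std::nullopt;
    }

    // Only one of groups 1 and 2 can participate in a match; take that one.
    const std::string user = m[1].matched ? m[1].str() : m[2].str();

    // An unmatched optional group yields an empty string from str().
    return ChatCommand{user, m[3].str(), m[4].str()};
}

// Sanity checks on the two sample lines from the question.
void check_examples() {
    const auto a = parse_chat_line("<Darker> MC_BOT sethome");
    assert(a && a->user == "Darker" && a->command == "sethome");

    const auto b = parse_chat_line("(From Darker): MC_BOT exit");
    assert(b && b->user == "Darker" && b->command == "exit");
}
```

With boost::regex and the branch reset pattern above, the m[1].matched check should not be needed, since the username would be in group 1 for both formats.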