c++ regex qt qregexp

QRegExp quantifier capture


Here is my problem: I use QRegExp from the Qt library to capture the parts of a Prolog expression like this

ma(v,c).

with this regular expression

([a-z][A-Za-z0-9]*)(\\()([A-Za-z0-9]*,)*([A-Za-z0-9]*)(\\))(\\.)

In this case it gives me

"ma"   "("   "v"  ","   "c"   ")"    "." 

But when I try this Prolog clause

ma(v,c,r).

it only gives me

"ma"   "("   "c"  ","   "r"   ")"    "." 

Is there a way to capture every repetition matched by the quantifier?


Solution

  • In a regex, the capture groups are fixed by the pattern, not by the input being parsed. The number of groups does not grow or shrink with the data being analyzed.

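    Writing ([A-Za-z0-9]*,)* does not make the regex return zero or more groups, one per match of [A-Za-z0-9]*,. It is still a single group, and when the quantifier repeats it, only the text from the last repetition is kept. That is why "v" is lost in the second example.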

    My suggestion is to divide the work into three operations: one to parse the whole structure, another to split the argument list v,c,r on ",", and one to put everything together:

    Operation 1 - use this regex: ([a-z][A-Za-z0-9]*)(\\()((?:[A-Za-z0-9]*,?)*)(\\))(\\.)

    Operation 2 - split the string in group(3) on "," to get each element.

    Operation 3 - concatenate: group(1) + group(2) + result_of_operation_2 + group(4) + group(5)
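
The three operations above can be sketched as follows. This is only an illustration under some assumptions: it uses std::regex from the standard library instead of QRegExp so that it compiles without Qt, and the function name tokenizeTerm is made up for this example. With Qt, the analogous calls would presumably be QRegExp::exactMatch, QRegExp::cap(3) and QString::split(","), but that variant is not shown here.

```cpp
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split a Prolog-style term such as "ma(v,c,r)."
// into its tokens. Returns an empty vector when the input does not match.
std::vector<std::string> tokenizeTerm(const std::string& input) {
    // Operation 1: group 3 captures the whole argument list as one string,
    // because the repeated part is a non-capturing group (?: ... ).
    static const std::regex pattern(
        "([a-z][A-Za-z0-9]*)(\\()((?:[A-Za-z0-9]*,?)*)(\\))(\\.)");
    std::smatch m;
    if (!std::regex_match(input, m, pattern)) {
        return {};
    }

    // Start with the name and the opening parenthesis.
    std::vector<std::string> tokens{m[1].str(), m[2].str()};

    // Operation 2: split group 3 on ',' and re-insert the separators.
    std::istringstream args(m[3].str());
    std::string arg;
    bool first = true;
    while (std::getline(args, arg, ',')) {
        if (!first) {
            tokens.push_back(",");
        }
        tokens.push_back(arg);
        first = false;
    }

    // Operation 3: append the closing parenthesis and the final dot.
    tokens.push_back(m[4].str());
    tokens.push_back(m[5].str());
    return tokens;
}
```

Calling tokenizeTerm("ma(v,c,r).") should give the nine tokens "ma", "(", "v", ",", "c", ",", "r", ")", ".", and the two-argument input ma(v,c). should give seven. Note that the pattern is more permissive than real Prolog syntax (it accepts a trailing comma, for instance), so stricter validation would need a tighter expression.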