Regex named capture groups in Delphi XE


I have built a match pattern in RegexBuddy that behaves exactly as I expect. But I cannot transfer it to Delphi XE, at least not when using the latest built-in TRegEx or TPerlRegEx.

My real-world code has 6 capture groups, but I can illustrate the problem with a simpler example. This code shows "3" in the first dialog and then raises an exception (-7 index out of bounds) when executing the second dialog.

var
  Regex: TRegEx;
  M: TMatch;
begin
  Regex := TRegEx.Create('(?P<time>\d{1,2}:\d{1,2})(?P<judge>.{1,3})');
  M := Regex.Match('00:00  X1 90  55KENNY BENNY');
  ShowMessage(IntToStr(M.Groups.Count));
  ShowMessage(M.Groups['time'].Value);
end;

But if I use only one capture group

Regex := TRegEx.Create('(?P<time>\d{1,2}:\d{1,2})');

then the first dialog shows "2" and the second dialog shows the time "00:00", as expected.

It would be rather limiting if only one named capture group were allowed, but that's not the case. If I change the capture group name to, for example, "atime":

var
  Regex: TRegEx;
  M: TMatch;
begin
  Regex := TRegEx.Create('(?P<atime>\d{1,2}:\d{1,2})(?P<judge>.{1,3})');
  M := Regex.Match('00:00  X1 90  55KENNY BENNY');
  ShowMessage(IntToStr(M.Groups.Count));
  ShowMessage(M.Groups['atime'].Value);
end;

I get "3" and "00:00", just as expected. Are there reserved words I cannot use? I don't think so, because in my real example I've tried completely random names. I just cannot figure out what causes this behaviour.


Solution

  • When pcre_get_stringnumber does not find the name, PCRE_ERROR_NOSUBSTRING is returned.

    PCRE_ERROR_NOSUBSTRING is defined in RegularExpressionsAPI as PCRE_ERROR_NOSUBSTRING = -7.

    Some testing shows that pcre_get_stringnumber returns PCRE_ERROR_NOSUBSTRING for every name whose first letter is in the range k to z, and that this range depends on the first letter of judge. Changing judge to something else changes the range. That is why time fails while atime works.

    As I see it, there are at least two bugs involved here: one in pcre_get_stringnumber, and one in TGroupCollection.GetItem, which should raise a proper exception instead of SRegExIndexOutOfBounds.
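
Since the failure happens in the name lookup, one possible workaround is to read the groups by number instead of by name. The following is only a sketch, assuming that the integer overload of M.Groups[...] does not go through pcre_get_stringnumber; the constants GROUP_TIME and GROUP_JUDGE are hypothetical names introduced here for illustration, and the group numbers simply follow the order of the opening parentheses in the pattern.

```delphi
program NamedGroupWorkaround;

{$APPTYPE CONSOLE}

uses
  SysUtils, RegularExpressions;

const
  // Hypothetical constants (not part of the RTL): group 0 is the whole
  // match, groups 1..n follow the order of the opening parentheses.
  GROUP_TIME  = 1;
  GROUP_JUDGE = 2;

var
  Regex: TRegEx;
  M: TMatch;
begin
  // Same pattern and input as in the question; the names stay in the
  // pattern for readability but are not used for the lookup.
  Regex := TRegEx.Create('(?P<time>\d{1,2}:\d{1,2})(?P<judge>.{1,3})');
  M := Regex.Match('00:00  X1 90  55KENNY BENNY');

  if M.Success then
  begin
    // Access by integer index instead of by name.
    Writeln('time:  ', M.Groups[GROUP_TIME].Value);
    Writeln('judge: ', M.Groups[GROUP_JUDGE].Value);
  end
  else
    Writeln('No match');
end.
```

Keeping the indexes in named constants preserves some of the readability of named groups, but the constants have to be updated by hand whenever a group is added to or removed from the pattern.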