What is the proper code using TRegExpr (or anything similar) to extract the WHERE clause of an SQL select statement, such that it stops at 'group by
' if present, or at 'order by
' if present, or at end of statement if both 'group by
' and 'order by
' are not present?
Input string:
select p.PRODUCT_TYPE_ID, p.PRODUCT_TYPE_NAME
from PRODUCT_TYPE
where p.PRODUCT_TYPE_NAME containing :PRODUCT_TYPE_NAME
group by whatever
order by p.PRODUCT_TYPE_NAME
Regular expression tried:
where[\s]+(.+)(\s+group\s+by\s+|\s+order\s+by\s+|)
This regex works in the online parsers I tried, but not in this Pascal code:
Program HelloWorld(output);
uses
regexpr, classes;
var
re: tregexpr;
s: string;
begin
s := 'select p.PRODUCT_TYPE_ID, p.PRODUCT_TYPE_NAME '+
'from PRODUCT_TYPE '+
'where p.PRODUCT_TYPE_NAME containing :PRODUCT_TYPE_NAME '+
'order by whatever '+
'group by whatever';
re := tregexpr.create;
re.ModifierStr := '-i';
re.expression := 'where[\s]+(.+)(\s+group\s+by\s+|\s+order\s+by\s+|)';
re.inputstring := s;
re.Exec(s);
writeln('match[1]');
writeln(re.match[1]);
re.free;
end.
I tried converting the second group to non-capturing with ?:
to no avail.
where[\s]+(.+)(\s+group\s+by\s+|\s+order\s+by\s+|)
Documentation says non-capturing group is supported, but the FAQ contradicts it later on, or I missed something.
I was expecting match [1] to return p.PRODUTCT_TYPE_NAME containing : PRODUCT_TYPE_NAME
I would try this:
re.expression := 'where\s+(.+?)(\s+group\s+by\s+|\s+order\s+by\s+|\s+$)';
I replaced +
with +?
in the first group. It's "lazy" capture mode.
I replaced |)
with |\s+$)
in the second group. $
means "the end of the string".
I can't be sure that it would work in all cases where you might want to use it, but it works with the sample that you provided.
By the way, these two lines are not needed:
re.modifierstr := '-i';
re.inputstring := s;