I need to validate input using a regex. The input must be a comma-separated list of one or more words, such as a,b,c or a,b,c,d,e, and whitespace may appear before or after each item, like below:
t1
t2,t1
t1 , t2
a
a,b
The following are invalid:
a,
,
<empty>
I came up with this regex:
(\s*(\w+\s*,\s*)*\s*\w+\s*)
The matching works fine, but it has polynomial complexity for the attack string '\t'.repeat(1651) + '\t'.repeat(1651) + ',0'
I treat the input as matching only if the main group equals the whole input string. That is, I would reject inputs like a, even though a subgroup matched.
Any suggestions for making it safe (linear)? I tried a lookahead approach and lazy quantifiers but could not get it right.
Once this expression is safe, the end goal is to add a prefix/suffix and keep it safe,
something like:
PREFIX (\s*(\w+\s*,\s*)*\s*\w+\s*) SUFFIX
I was trying something like this, but it stops matching correct inputs:
PREFIX(?=(?(\s*(\w+\s*,\s*)\s\w+\s*)))\k SUFFIX)
With the above, even correct inputs like the one below are not matched:
PREFIX a,b SUFFIX
Thanks..
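You could update the pattern to start by matching 1+ word characters, and then optionally repeat a comma between optional whitespace chars followed by 1+ word characters. The original pattern is slow because adjacent \s* quantifiers can match the same run of whitespace (for example the leading \s* and the \s* before the final \w+), so a failed match backtracks over many ways of splitting that run.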
Note that \s can also match newlines.
PREFIX \w+(?:\s*,\s*\w+)* SUFFIX
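As a rough sketch of how this could be used for whole-string validation in JavaScript: the names isValidList, LIST_RE and WRAPPED_RE are made up for illustration, and the ^ and $ anchors plus the leading/trailing \s* are my assumptions based on the requirement in the question (whitespace allowed at the string boundaries), not part of the pattern above.

```javascript
// Hypothetical names; anchors and outer \s* are assumptions, see the note above.
// Each whitespace run is matched by exactly one \s*, so a failed match
// should not have many ways to split the same run.
const LIST_RE = /^\s*\w+(?:\s*,\s*\w+)*\s*$/;

// Same idea with the literal PREFIX / SUFFIX placeholders from the question.
const WRAPPED_RE = /^PREFIX \w+(?:\s*,\s*\w+)* SUFFIX$/;

// Returns true only when the whole input is a comma-separated list of words.
function isValidList(input) {
  return LIST_RE.test(input);
}

console.log(isValidList("t1 , t2"));                 // true
console.log(isValidList("a,"));                      // false
console.log(WRAPPED_RE.test("PREFIX a,b SUFFIX"));   // true
```

Anchoring the pattern replaces the "main group equals the input" check, since a partial match such as a, can no longer succeed. If newlines should not count as whitespace here, one option is to swap \s for something like [ \t].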