javascript regex backtracking

Regex in JavaScript for CSV-style string matching


I have a requirement to validate input using a regex. The input should be a comma-separated list of words, e.g. a,b,c or a longer list such as a,b,c,d,e, and whitespace can occur before or after each item. The following are valid:

t1
t2,t1
 t1 , t2 
 a    
 a,b

The following are invalid:

a,
,
<empty>

I came up with this regex:

(\s*(\w+\s*,\s*)*\s*\w+\s*)

The matching works fine, but it has polynomial complexity for the attack string '\t'.repeat(1651) + '\t'.repeat(1651) + ',0'

I consider the input valid only if the main group equals the whole input string. That is, I would reject an input like "a," even though part of it matches the group.

Any suggestions to make it safe/linear? I tried a lookahead approach and lazy quantifiers but could not get it right.

Once this expression is safe, the end goal is to add a prefix and a suffix while keeping the whole pattern safe, something like:

    PREFIX (\s*(\w+\s*,\s*)*\s*\w+\s*) SUFFIX

I was trying something like this, but it stops matching correct inputs:

PREFIX(?=(?(\s*(\w+\s*,\s*)\s\w+\s*)))\k SUFFIX)

With the above, even correct inputs such as the following are not matched:

PREFIX a,b SUFFIX

Thanks..


Solution

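  • You could update the pattern to start by matching 1+ word characters, and then optionally repeat a group that matches a comma between optional whitespace chars followed by 1+ word characters. Every repetition then has to consume a comma, so there are no longer two adjacent \s* quantifiers competing for the same run of whitespace, which is what caused the excessive backtracking in the original pattern.

    Note that \s can also match newlines.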

    PREFIX \w+(?:\s*,\s*\w+)* SUFFIX
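
A minimal sketch of how that pattern might be used for whole-string validation. The helper names are hypothetical, PREFIX and SUFFIX are just the placeholder literals from the question, and the ^/$ anchors are an assumption based on the requirement that the entire input must match. The one-line variant swaps \s for [^\S\r\n] (whitespace except CR and LF) in case newlines should not be accepted:

```javascript
// Sketch only: helper names are hypothetical, and "PREFIX"/"SUFFIX" are the
// placeholder literals from the question, not real tokens.

// Anchored with ^ and $ so the whole input must match (assumption: this is
// what "the main group equals the input string" means). Leading and trailing
// whitespace is matched once, outside the repeated group.
const LIST_RE = /^\s*\w+(?:\s*,\s*\w+)*\s*$/;

// Variant using [^\S\r\n] (whitespace except CR/LF) instead of \s,
// for cases where the list must stay on a single line.
const LIST_ONE_LINE_RE =
  /^[^\S\r\n]*\w+(?:[^\S\r\n]*,[^\S\r\n]*\w+)*[^\S\r\n]*$/;

// The pattern from the answer with its literal prefix and suffix, anchored.
const PREFIXED_RE = /^PREFIX \w+(?:\s*,\s*\w+)* SUFFIX$/;

function isValidList(input) {
  return LIST_RE.test(input);
}

function isValidOneLineList(input) {
  return LIST_ONE_LINE_RE.test(input);
}

function isValidPrefixed(input) {
  return PREFIXED_RE.test(input);
}

console.log(isValidList(" t1 , t2 "));              // true
console.log(isValidList("a,"));                     // false
console.log(isValidList("a,\nb"));                  // true (\s matches the newline)
console.log(isValidOneLineList("a,\nb"));           // false
console.log(isValidPrefixed("PREFIX a,b SUFFIX"));  // true
```

With the anchors in place there is no need to compare a captured group against the input afterwards; test() returning true already means the whole string matched.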