javascript, regex, joi

Joi validation breaks on special characters


I want to restrict the content of a textbox to 250 words, and I'm using Joi validation for it. It should accept all characters (including special characters) and allow at most 250 words. However, I'm facing the following problems.

  1. As soon as the first special character appears, even if it comes after only 3 words, I get the Joi validation error "This section must contain no more than 250 words".

  2. When I copy-paste content from a PDF into the textbox, my screen freezes, so I think there is something wrong with my Joi schema.

description1: Joi.string().regex(/^(([\w\s,."'()-]+)\b[\s,.]*){0,250}$/).options({ language: { string: { regex: { base: 'This section must contain no more than 250 words' } } } }).label("this section"),

Could someone help me?


Solution

  • This might work:

    /^\s*(?:\S+\s+){0,249}\S*\s*$/
    

    Unlike \w in your original regex, which matches only [a-zA-Z0-9_], the token \S matches any non-whitespace character, so special characters count as part of a word. Because the character sets \S and \s are disjoint, there is only one way to divide the text between \S+ and \s+, which should avoid any issues with catastrophic backtracking.
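
To make the difference between \w and \S concrete, here is a minimal sketch that runs both patterns against a short string. The sample text is hypothetical, picked only because it contains the kind of typographic quotes and accented letters a PDF paste tends to produce.

```javascript
// Hypothetical sample text with typographic quotes, accents, and symbols.
const sample = "It’s a “naïve” café: 50% off!";

// The pattern from the question: only \w, whitespace, and a few punctuation marks.
const originalPattern = /^(([\w\s,."'()-]+)\b[\s,.]*){0,250}$/;

// The suggested pattern: any non-whitespace character is part of a word.
const suggestedPattern = /^\s*(?:\S+\s+){0,249}\S*\s*$/;

console.log(originalPattern.test(sample));  // false (stops at the first ’)
console.log(suggestedPattern.test(sample)); // true  (6 words, all characters allowed)
```

The original pattern rejects the text at the first character outside its character class, which is why the "no more than 250 words" message shows up after only a few words.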

    Explanation:

    • \s* 0 or more whitespace characters at the start. These are not counted at all.
    • (?:\S+\s+) a word, consisting of 1 or more non-whitespace characters followed by 1 or more whitespace characters.
    • {0,249} Repeat the group 249 times at most.
    • \S* optionally an extra word at the end, which does not need to be followed by whitespace. This is the 250th word, which is why the count above needs to be 249, not 250.
    • \s*$ 0 or more whitespace characters at the end of the text. These are not counted either.
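
The counting described above can be checked at the boundary with a small sketch. `makeWords` is a hypothetical helper introduced only for this illustration; it is not part of Joi or of the original schema.

```javascript
const wordLimit = /^\s*(?:\S+\s+){0,249}\S*\s*$/;

// Hypothetical helper: `count` placeholder words joined by single spaces.
const makeWords = (count) => Array(count).fill("word").join(" ");

console.log(wordLimit.test(""));                            // true: zero words
console.log(wordLimit.test(makeWords(250)));                // true: 249 via the group, 1 via \S*
console.log(wordLimit.test("  " + makeWords(250) + " \n")); // true: outer whitespace is ignored
console.log(wordLimit.test(makeWords(251)));                // false: one word too many
```

In the schema from the question, this pattern would take the place of the one currently passed to .regex() for description1.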

    This can backtrack only over the length of the last word, so it could be slow if the last word is extremely long. However, the growth can't be exponential, so it shouldn't crash Joi.
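
As a rough sanity check of the backtracking claim, the sketch below feeds the pattern a large input that is over the limit and measures how long the rejection takes. The input is a hypothetical stand-in for a long PDF paste, and the timing will vary by machine, so treat it as an informal check rather than a benchmark.

```javascript
const wordLimit = /^\s*(?:\S+\s+){0,249}\S*\s*$/;

// Hypothetical stand-in for a long paste: 20,000 short words
// separated by a space and a newline.
const pasted = Array(20000).fill("lorem,").join(" \n");

const start = Date.now();
const accepted = wordLimit.test(pasted);
const elapsedMs = Date.now() - start;

console.log(accepted); // false, because the text has far more than 250 words
console.log(`checked ${pasted.length} characters in ${elapsedMs} ms`);
```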