Tags: regex, perl, code-injection

How can I safely validate an untrusted regex in Perl?


This answer explains that to validate an arbitrary regular expression, one simply uses eval:

while (<>) {
    eval "qr/$_/;";
    print $@ ? "Not a valid regex: $@\n" : "That regex looks valid\n";
}

However, this strikes me as very unsafe, for what I hope are obvious reasons. Someone could input, say:

foo/; system('rm -rf /'); qr/

or whatever devious scheme they can devise.
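
To make the risk concrete, one can print the code that the string form of eval would be asked to compile, without ever running it. This is only a sketch; `$input` is a hypothetical stand-in for one line of untrusted input.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a line read from an untrusted user.
my $input = q{foo/; system('rm -rf /'); qr/};

# This is the string that eval "qr/$_/;" would compile.
# It is only printed here, never evaluated.
my $code = "qr/$input/;";
print "$code\n";   # prints: qr/foo/; system('rm -rf /'); qr//;
```

The input closes the first pattern early, so everything after it is compiled as ordinary Perl statements.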

The natural way to prevent such things is to escape special characters, but if I escape too many characters, I severely limit the usefulness of the regex in the first place. A strong argument can be made, I believe, that at least []{}()/-,.*?^$! and white space characters ought to be permitted (and probably others), un-escaped, in a user regex interface, for the regexes to have minimal usefulness.

Is it possible to secure myself from regex injection, without limiting the usefulness of the regex language?


Solution

  • The solution is simply to change

    eval("qr/$_/")
    

    to

    eval("qr/\$_/")
    

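    With the sigil escaped, the string handed to eval is the literal code qr/$_/, so the input is interpolated into the pattern at run time instead of being pasted into Perl source. This can be written more clearly with single quotes: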

    eval('qr/$_/')
    

    But that's still not optimal. The following would be far better as it doesn't involve generating and compiling Perl code at run-time:

    eval { qr/$_/ }
    

    Note that none of these forms protects you from denial-of-service attacks. It's quite easy to write a pattern that will take longer than the life of the universe to complete. To handle that situation, you could execute the regex match in a child process for which a CPU ulimit has been set.
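
The child-process idea might be sketched roughly as follows, assuming a Unix-like system where fork and alarm behave natively. Note the swap: this uses a wall-clock alarm in the child rather than a CPU ulimit, because a true CPU limit would need setrlimit (for example through the CPAN module BSD::Resource), which is not shown here. The helper names `compile_pattern` and `match_with_timeout` are made up for illustration.

```perl
use strict;
use warnings;
use POSIX ();

# Compile an untrusted pattern without building Perl source from it.
# Returns a compiled regex, or undef with the error left in $@.
sub compile_pattern {
    my ($pattern) = @_;
    return eval { qr/$pattern/ };
}

# Run the match in a forked child that SIGALRM kills after $seconds
# of wall-clock time. Returns 1 (matched), 0 (no match), or undef
# (timed out, or the child failed in some other way).
sub match_with_timeout {
    my ($re, $text, $seconds) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: no Perl-level handler is installed, so the default
        # action of SIGALRM terminates the process even mid-match.
        alarm($seconds);
        my $matched = $text =~ $re ? 1 : 0;
        # Exit status 0 means matched, 1 means no match.
        POSIX::_exit($matched ? 0 : 1);
    }

    waitpid($pid, 0);
    return undef if $? & 127;    # child was killed by a signal
    my $status = $? >> 8;
    return $status == 0 ? 1 : $status == 1 ? 0 : undef;
}

# Example with hypothetical input values.
my $re = compile_pattern('fo+') or die "Not a valid regex: $@";
my $result = match_with_timeout($re, 'foo bar', 2);
print defined $result ? "matched: $result\n" : "timed out or failed\n";
```

Only a yes/no answer comes back through the exit status. Returning captures would need a pipe or similar channel between the child and the parent.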