Search code examples
regexre2

Time complexity of google's re2 for a given length of regex pattern


I was looking into the google's re2 implementation, I want to do the regex match using regex given by some untrusted source. I have gone through the following resources -

With PCRE, we can get into some really bad cases and it can take exponential time to match. e.g. for regex (.*)+b and string like aaaaaaaaa... it can take O(c^n) time where n is length of string. Some more DOS expressions documented here.

I understand that the re2 will do the match in linear time for a given regex expression against a given length of a string. Now, my question is if I am able to compile a regex in re2, and I am putting a restriction on the length of a regex pattern then am I guaranteed that it won't run into some DOS cases mentioned in the above link or for some other regex pattern?


Solution

  • re2 has O(n) worst-case complexity for a given string of length n, unlike some others regex implementations that support backtracking/look aheads.

    So as long as both the length of string and length of regex pattern are capped, and you are validating the regex patterns beforehand, the re2 cannot be DOSed.

    But while migrating some patterns from PCRE or otherwise using it, you need to make sure that the regex pattern can be compiled as the implementation doesn't support many PCRE functionalities like backreferences.