Tags: regex, performance, regex-lookarounds, regex-greedy

RegEx for matching the last line


I'm configuring a RegEx (so it won't be easy to switch to plain code for this requirement) to get the last line of the input. If I use /.*$/, it gets quite slow for some inputs, e.g. '1'.repeat(1e6)+'\n2' in JavaScript. Is there a fast way to get the last line?

Also, if using a RegEx as the matching config is not a good idea, are there better suggestions?


Solution

  • An optimized expression for finding the final line of a large input string is one that introduces explicit boundaries:

    (?m)^.*\z
    

    In languages like PHP it would be written as /^.*\z/m (the slashes are delimiters and m is the multiline flag). The caret ^ keeps the engine from running the expensive .* at any position that is not the start of a line, whereas /.*$/ retries .* from every character position. This gives a well-defined boundary, not only for us to recognize the desired part but also for engines and their built-in optimizations.

    The performance of this regex depends on the number of lines in the input string. So an input string like yours isn't a problem at all, but something like this would need some attention.

    In both cases it runs fast and does not fail.
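
The test input in the question is JavaScript, whose regex flavor has no \z anchor. The following is a minimal sketch of the same idea under that constraint, not the answer's exact expression: it assumes the negative lookahead (?![^]) ("no character of any kind follows") as a stand-in for \z, and getLastLine is a hypothetical helper name used only for illustration.

```javascript
// Sketch: JavaScript equivalent of (?m)^.*\z
// - the m flag lets ^ match at the start of every line
// - (?![^]) asserts end of input, since JavaScript lacks \z
//   (with the m flag, $ would also match before every newline)
const lastLineRegex = /^.*(?![^])/m;

// Hypothetical helper: returns the last line of `input`.
function getLastLine(input) {
  const match = lastLineRegex.exec(input);
  return match === null ? null : match[0];
}

// The input from the question: one very long line followed by a short one.
const big = '1'.repeat(1e6) + '\n2';
console.log(getLastLine(big)); // prints "2"
```

Because ^ fails immediately at every position that is not a line start, .* is attempted once per line rather than once per character. Note that an input ending in a newline yields an empty string, since the last line is then empty.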