Tags: regex, pcre, regex-lookarounds, lookbehind

When to use positive lookarounds in Regex?


Can someone explain why and when I should use positive lookarounds in regex? For negative lookarounds I can think of scenarios where they are the only solution, but for positive lookarounds I don't see why I would use them when the same result can also be produced with capture groups.

For example:

Input: Bus: red, Car: blue

I want the color of the car.

With lookaround: (?<=Car: )\w+

With capture group: Car: (\w+)

Both regexes achieve the same result: direct access to the color match. So are there cases which can only be solved by positive lookarounds?
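
For reference, here is a minimal sketch of the two approaches. It uses Python's re module as a stand-in (an assumption; the question is about PCRE, but both patterns should behave the same way in this case). The variable names are only illustrative.

```python
import re

text = "Bus: red, Car: blue"

# Lookbehind: the whole match (group 0) is already the color.
lookbehind_match = re.search(r"(?<=Car: )\w+", text)
color_via_lookbehind = lookbehind_match.group(0)

# Capture group: the whole match is "Car: blue"; the color is in group 1.
group_match = re.search(r"Car: (\w+)", text)
color_via_group = group_match.group(1)

print(color_via_lookbehind, color_via_group)
```

The only visible difference is where the color ends up: in the full match for the lookbehind, in a submatch for the capture group.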


Solution

  • PCRE is not used only in PHP; the library is embedded in a variety of tools and languages, and not all of them give you easy access to captured groups.

    In some of them, a lookbehind is the easiest way to, say, split a string (with strsplit in R), or work around the problems with accessing submatches.
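
As a rough illustration of the splitting case, the sketch below uses Python's re.split instead of R's strsplit (an assumption made for the sake of a runnable example; the sample sentence is made up). The lookbehind lets the split happen on the whitespace while the period stays attached to the preceding piece.

```python
import re

text = "One. Two. Three"

# Split on whitespace that is preceded by a period.
# The period is only asserted, not consumed, so it stays in the output.
with_lookbehind = re.split(r"(?<=\.)\s", text)

# With a capture group the period is consumed by the delimiter and
# comes back as a separate list element.
with_group = re.split(r"(\.)\s", text)

print(with_lookbehind)
print(with_group)
```

A split function has no natural way to return "group 1 of the delimiter" glued to the preceding piece, which is why the lookbehind is the more convenient tool here.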

    PCRE lookbehind is "crippled" in a way: it must be fixed-width, so it is not really full-fledged. However, here is an interesting case where a positive lookbehind placed after the match can improve performance: \d{3}(?<=USD\d{3}). Here, the lookbehind check only runs once 3 digits have been matched, so the engine does not need to test for U, then S, then D at every position before it looks for the digits.
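
The following sketch only checks that the two placements of the lookbehind return the same matches; it makes no claim about speed, which depends on the engine and the input. It again uses Python's re module as a stand-in for PCRE (an assumption), and the sample string is made up.

```python
import re

text = "USD123 EUR456 USD789"

# Lookbehind first: assert "USD" before trying to match the digits.
lookbehind_first = re.findall(r"(?<=USD)\d{3}", text)

# Lookbehind last: match three digits, then check that the six
# characters ending here are "USD" followed by those digits.
lookbehind_last = re.findall(r"\d{3}(?<=USD\d{3})", text)

print(lookbehind_first)
print(lookbehind_last)
```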

    As for a positive lookahead, it is used in a lot of scenarios:

    • Set conditions on the string matched (see Dmitry's answer; e.g. ^(?=.*\d) requires at least 1 digit in the string)
    • Overlapping matches are possible (e.g. -\d+(?=-|$) will find 3 matches in -1-2-3)
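
Both lookahead uses above can be sketched as follows, again with Python's re module standing in for PCRE (an assumption). The helper name is_valid and the extra lowercase-letter and length conditions are hypothetical, added only to show how several lookahead conditions stack.

```python
import re

# Conditions: each lookahead checks one requirement from the start of
# the string without consuming anything.
def is_valid(s: str) -> bool:
    # at least one digit, at least one lowercase letter, 6+ word chars
    return re.search(r"^(?=.*\d)(?=.*[a-z])\w{6,}$", s) is not None

# Overlapping delimiters: the lookahead leaves the next "-" unconsumed,
# so it can start the following match.
with_lookahead = re.findall(r"-\d+(?=-|$)", "-1-2-3")

# Consuming the trailing "-" instead means "-2" is never found.
without_lookahead = re.findall(r"-\d+(?:-|$)", "-1-2-3")

print(is_valid("abc123"), is_valid("abcdef"), is_valid("123456"))
print(with_lookahead)
print(without_lookahead)
```

In the second case a capture group could recover the text of each match, but not the matches that were skipped because their leading hyphen had already been consumed.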