Replace multiple occurrences of a character after a zero-length assertion


I would like to replace every _ with a - on lines starting with #| label: using PCRE2 regex within my text editor.

Example:

#| label: my_chunk_label
my_function_name <- function(x)

Should become:

#| label: my-chunk-label
my_function_name <- function(x)

In .NET regex one could substitute (?<=^#\| label: .+)_ with -, but PCRE2 does not support unbounded-length lookbehind, so that pattern is invalid there. So far, the only way I have found is to repeatedly substitute ^#[^_]+\K_ with -, one underscore per pass. Is there a single-pass solution?
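For readers who want to reproduce the repeated-substitution workaround outside an editor, here is a minimal Python sketch. It rests on some assumptions: Python's built-in re module has no \K, so a capture group stands in for it; the class is narrowed to [^_\n] so a pass cannot run on into the following line; and the helper name dash_labels_iteratively is hypothetical.

```python
import re

# One underscore per label line is replaced on each pass.
# The capture group stands in for \K, which Python's re module lacks.
STEP = re.compile(r"^(#\| label: [^_\n]*)_", flags=re.MULTILINE)


def dash_labels_iteratively(text: str) -> str:
    """Repeat the substitution until a pass changes nothing."""
    while True:
        text, count = STEP.subn(r"\1-", text)
        if count == 0:
            return text
```

The number of passes equals the largest number of underscores on any label line, plus one final pass that finds nothing.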


Solution

  • If you are using PCRE, you can combine \G and \K. Enable the multiline flag if your editor does not already treat ^ as the start of each line:

    (?:^#\|\h+label:\h+|\G(?!^))[^\r\n_]*\K_

    In the replacement use -

    The pattern matches:

    • (?: Non capture group for the alternatives
      • ^#\|\h+label:\h+ Match the label prefix at the start of the line, where \h matches a horizontal whitespace character
      • | Or
      • \G(?!^) Assert the position at the end of the previous match; \G also matches at the start of the subject before any match has been made, and (?!^) rules that position out
    • ) Close the non capture group
    • [^\r\n_]* Match zero or more characters other than newlines or _
    • \K Forget what is matched so far
    • _ Match the underscore
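To make the \G chaining concrete, here is a minimal Python sketch that emulates the steps listed above. It is not PCRE itself: Python's built-in re module supports neither \G nor \K, so Pattern.match(text, pos), which only matches starting exactly at pos, stands in for \G, and only the final underscore of each step is rewritten to mimic \K. The names PREFIX, RUN and dash_labels are hypothetical, and [ \t] is a narrower stand-in for \h.

```python
import re

# First alternative of the pattern: the label prefix at the start of a line.
# [ \t] approximates \h here (an assumption; \h also covers other horizontal spaces).
PREFIX = re.compile(r"^#\|[ \t]+label:[ \t]+", flags=re.MULTILINE)
# The shared tail: a run without newlines or underscores, then one underscore.
RUN = re.compile(r"[^\r\n_]*_")


def dash_labels(text: str) -> str:
    chars = list(text)
    for prefix in PREFIX.finditer(text):
        pos = prefix.end()
        # match(text, pos) must start exactly at pos, like \G continuing
        # from the end of the previous match.
        while (m := RUN.match(text, pos)) is not None:
            # Only the underscore is replaced; the text before it is kept, like \K.
            chars[m.end() - 1] = "-"
            pos = m.end()
    return "".join(chars)
```

Once a step hits the end of the line without finding another underscore, the chain breaks, which is why lines without the label prefix are never touched.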
