Search code examples
postgresqlregexp-replace

Using regexp_replace how do t replace a string with an exception


How do I replace all occurrences of ' sub.*' with the exception of ' substation.*'?

regexp_replace("CleanString",' sub.*',' ', 'ig')

I have tried using various combinations of groupings () but still not getting it.

Using postgres regexp_replace()


Solution

  • A regular expression normally matches only things that are there, not things that are not there - you cannot simply put an "if-then-else" in there.

    However, Postgres's regex support, the manual page for which is here includes "lookahead" and "lookbehind" expressions.

    In your case, you want a *negative lookahead":

    (?!re) negative lookahead matches at any point where no substring matching re begins (AREs only)

    It's important to note the phrase "at any point" - lookarounds are "zero width", so (?!station) doesn't mean "something other than station", it means "a position in the string where station isn't coming next".

    You can therefore construct your query like this:

    ' sub(?!station).*'
    

    That will match any of "sub", "foo sub", " subbar", or "foo subbar", but not any of "substation", "foo substation", " substationbar", or "foo substationbar". Since the (?!station) is zero-width, and the next token is .*, it's fine for nothing to come after " sub".

    If you want there to be something after the "sub", you could instead write:

    ' sub(?!station).+'
    

    The .+ means "at least one of something", so it will still match " subbar" and "foo subbar", but will no longer match " sub" or "foo sub".