Search code examples
postgresqlregex-negationregexp-replace

postgres regexp_replace negate presence of consecutive groups of chars


i have a database export that come with some wrong chararacter substitution(like è => e'). So i have to change it back in postgres.
I'm going crazy to make an all in one regex that catch something like this:

cassine' de' pecche' e'

it must become

cassinè de' pecchè è

(de' must not change).

I managet to get it with two pass:

UPDATE mytable SET comun1=UPDATE mytable SET comun1=regexp_replace(column1,'([^dnNn])(e\'')', '\1\2è', 'g');   

UPDATE mytable SET comun1=UPDATE mytable SET comun1=regexp_replace(column1,'([^\s])([dnNn])(e\'')', '\1\2è', 'g'); 

Basically i want to exclude from replace a space followed by d or n and followed by e' (like " de'") and change the e' in all other cases.
I tried (?!\s[nNdD])(e\'') but it still changes " de'" to " dè"

Does anyone have a solution for this?


Solution

  • select regexp_replace($$cassine' de' pecche' e'$$, $$(\s[^dn]?|\w\w)e'$$, '\1è', 'gi')
        regexp_replace    
    ----------------------
     cassinè de' pecchè è
    (1 row)
    

    Explanation:

    (\s[^dn]?|\w\w)e'
     ^       ^     ^ 
     |       |     followed by e'
     |       or 2 word chars
     space optionally followed by d or n