Search code examples
regexpostgresqlsql-likeregexp-like

REGEXP_LIKE in Postgresql


I have a table to store substitutions which is including two fields, the first one is to store the word, the second is to store the substitutions. I know the creation of the table is not a suitable approach, but it is already in place and used by other systems.

The table look like the following:

WORD        SUBS_LIST
------------------------------------
MOUNTAIN    MOUNTAIN, MOUNT, MT, MTN
VIEW        VIEW, VU
FIFTH       V, 5TH
YOU         EWE, U , YEW
ROW         ROW , ROE
ONE         UN , ONE

Then, when a name comes in it is substitute according to the table. I was able to the previous on Oracle using regexp_like. However, I would like to apply the same in Postgresql. I have attempt using ~ to replace regexp_like and regexp_matches without success.

Please find here the DB<>Fiddle which I have tried so far.

Thank you for your help :)


Solution

  • You don't need regexp for that. If I understood you correctly you want to input a word, search the elements in sub_list and return the word column for that. This is best done by converting the (ugly) comma separated list to an array, then use the ANY operator:

    select word
    from the_table
    where 'mount' = any(string_to_array(subs_list, ','));
    

    The above would deal properly with the whitespace you have around the , - not sure if that is a result of your formatting or if you really store the list that way. If you do need to deal with whitespaces, you can use the following:

    select word
    from the_table
    where exists (select *
                  from unnest(string_to_array(subs_list, ',')) as x(subs)  
                  where trim(x.subs) = 'mount');
    

    If your input is a list of words, you can use regexp_split_to_table() to turn the input words into rows and join to the substitutions.

    SELECT w.input, coalesce(x.word, w.input) as word
    FROM regexp_split_to_table('MOUNT VU FOOD CAFE', '\s') as w(input) 
      LEFT JOIN (
        select s.word, trim(s1.token) as token
        from subs s
          cross join unnest(string_to_array(s.subs_list, ',')) s1(token)
      ) as x on lower(trim(w.input)) = lower(x.token)
    ;
    

    Online example: https://rextester.com/DZBF77100