Search code examples
regexstringpostgresqlwhere-clause

How to compare two columns by ignoring special charactes?


I am comparing two columns from different tables to get the matching records. Those tables do not have any unique key other than first and last names. But I don't get the correct output if tableA has Aa'aa and tableB has Aaaa. Could any one advise how to compare by ignoring the special characters / any other alternate solution to get them matched?

   SELECT * FROM TableA  A where EXISTS
    (SELECT '' FROM TableB B
                WHERE 
                TRIM(A.namef) = TRIM(B.namef)
                AND TRIM(A.namel) = TRIM(B.namel)
    )

Solution

  • You could try a regex approach. Assuming that you want to compare only the alphabetic and numeric characters, you can do:

    where 
            regexp_replace(a.namef, '\W', '', 'g') = regexp_replace(b.namef, '\W', '', 'g')
        and regexp_replace(a.namel, '\W', '', 'g') = regexp_replace(b.namel, '\W', '', 'g')
    

    Basically this removes non-word characters from each string before comparing them - with a word character being defined as a letter or a digit, plus the underscore character.