regex postgresql sql-like string-matching

select rows if column contain name elements, any order

I have a table that contains a few thousands rows. These results have been produced by querying an api with a list of names.

For a search of "John Snow" the api would return all entries in their database which contained the token "john" and the token "snow". So I have a lot of false positives.

What I would like to do is to subset my table so that the I only have records where the column title (the one which contained the string matched against my search string) contains ALL elements of the searched string, regardless of the order.

I tried

select * from table where 'john snow' ~* title;

select * from table where 'john snow' ILIKE '%' || title || '%';

both work but only if the column title contains exactly john snow in that order.

Before

str_searched |  title
-------------+-------
john snow    | snow white
john snow    | john wick
john snow    | SNOW john
john snow    | john Snow
john snow    | Mr john snow

After

str_searched |  title
-------------+-------
john snow    | SNOW john
john snow    | john Snow
john snow    | Mr john snow

Solution

demo:db<>fiddle

SELECT
    *
FROM
    my_table
WHERE
    regexp_split_to_array(lower('john snow'), ' ') 
        <@ regexp_split_to_array(lower(title), ' ')

lower() normalizes the capital letters
regexp_split_to_array() tranforms the text into an array at the space
Now you have 2 arrays which can be compared (e.g. [john,snow] and [mr,john,snow]). The <@ comparator checks if the left array is complete contained by the right one.

Disclaimer: On really huge tables this might be slow. In that case I would recommend to take a look a the full text search engine: https://www.postgresql.org/docs/current/textsearch.html