Search code examples
sqlarrayspostgresqlaggregate-functionspostgresql-10

Postgres - How to find a row with the biggest intersection of a certain column


I have a table called Protocols which holds a column called keyWords which is of type TEXT[]. Given an array of strings, how do I get the row with the biggest intersection of the keyWords column and the given array?


Solution

  • Use the function (that can be useful elsewhere as well):

    create or replace function array_intersect(anyarray, anyarray)
    returns anyarray language sql as $function$
        select case 
            when $1 is null then $2
            else
                array(
                    select unnest($1)
                    intersect
                    select unnest($2)
                )
            end;
    $function$;
    

    Query:

    with cte as (
        select 
            id, keywords,
            cardinality(array_intersect(keywords, '{a,b,d}')) as common_elements
        from protocols
        )
    select * 
    from cte
    where common_elements = (select max(common_elements) from cte)
    

    DbFiddle.

    In case you do not like the function:

    with cte as (
        select id, count(keyword) as common_elements
        from protocols
        cross join unnest(keywords) as keyword
        where keyword = any('{a,b,d}')
        group by 1
        )
    select id, keywords, common_elements
    from cte
    join protocols using(id)
    where common_elements = (select max(common_elements) from cte);
    

    DbFiddle.