Search code examples
sqlpostgresqlmultidimensional-arraymatchmetabase

PostgreSQL: Without using a function, how can nested arrays be compared?


I am using PostgreSQL in Metabase (I have read-only rights so functions won't work)

I have the following nested data in two columns:

Categories_A    Categories_B
{"A"}           {"B","F","C"}
{"B","A"}       {"Z","B","F"}
{"B","F"}       {"A","E","R"}

I would like to return in a new column the missing category/categories in categories_B when compared to categories_A - what is in Categories_A that is not in categories_B. So ideally:

Categories_A    Categories_B      Missing_Category
{"A"}           {"B","F","C"}     {"A"}
{"B","A"}       {"Z","B","F"}     {"A"}
{"B","F"}       {"A","E","R"}     {"B","F"}

This is the code I have but it doesn't work due to the "read-only" rights I have in Metabase.

create function array_except(p_one anyarray, p_two anyarray)
  returns anyarray
as
$$
  select array_agg(e)
  from (
    select e
    from unnest(p_one)
    except
    select e
    from unnest(p_two)
  )
$$
language plpgsql
immutable
;

select categories_a, categories_b, 
       array_except(categories_a, categories_b) as missing_categories
from my_table

How can I achieve this without a function?


Solution

  • Since there is no array_except function or operator in PostgreSQL, you have to unpack the array in your query:

    SELECT id,
           CASE missing_categories
                WHEN ARRAY[NULL]::text[]
                THEN ARRAY[]::text[]
                ELSE missing_categories
           END
    FROM (SELECT tab.id,
                 array_agg(arr.elem) AS missing_categories
          FROM tab
             LEFT JOIN LATERAL unnest(tab.categories_a) AS arr(elem)
                ON NOT tab.categories_b @> arr.elem
          GROUP BY tab.id) AS q;
    

    The outer query only replaces an array with a NULL element that occurs because of the outer join with an empty array.