Getting the max date per each Key per user

We have the following table

   userid        transaction_dates
    A            {'LEGO': '2022-05-03', 'XBOX': '2023-01-02'}
    A            {'LEGO': '2022-06-01', 'PS': '2023-04-01', 'XBOX': '2022-01-01'}
    B            {'SWITCH': '2021-05-01', 'APPLE': '2023-05-01'}

and so on.

What I want to do is that for per each userid, for each transaction item, get the most recent transaction date.

In the above, it should be

   userid        transaction_dates
    A            {'LEGO': '2022-06-01', 'XBOX': '2023-01-02', 'PS': '2023-04-01'}
    B            {'SWITCH': '2021-05-01', 'APPLE': '2023-05-01'}

Is there a way to do it in Presto SQL? If it's just a sum, I can use map_union_sum, but I couldn't find a presto sql function that does it.

The only thing I could think about is to explode the map column, and compute the max value grouping by the userid, but wasn't sure if there are more efficient ways.

Any help would be greatly appreciated.

Thanks.

Solution

Assuming that data is a map (otherwise you will need to do some json parsing and casting to map) then you can use unnest to flatten the data and then perform several groupings:

-- sample data
WITH dataset (userid, transaction_dates) AS (
    values ('A', map(array['LEGO', 'XBOX'], array['2022-05-03','2023-01-02'])),
    ('A', map(array['LEGO', 'XBOX','PS'], array['2022-06-01', '2023-04-01','2022-01-01'])),
    ('B', map(array['SWITCH', 'APPLE'], array['2021-05-01', '2023-05-01']))
)

-- query
select userid, map_agg(k, v) transaction_dates
from (select userid, k, max(v) v
      from dataset,
           unnest(transaction_dates) as t (k, v)
      group by userid, k)
group by userid;

Output:

userid	transaction_dates
B	{SWITCH=2021-05-01, APPLE=2023-05-01}
A	{LEGO=2022-06-01, XBOX=2023-04-01, PS=2022-01-01}