Search code examples
sqljsonpostgresqlentity-attribute-value

How to groupBy in postgres with jsonb column to mimic an EAV count table?


I have a jsonb column that looks like this:

Id Data
1 {state: ["CA", "NY"], county:["Los Angeles"]}
2 {city: ["Kansas City"], zipCode: "12345"}
3 {state: ["CO, WA"], zipCode: "5212"}

But I used to have a data structure like so:

Id Attribute Value
1 state CA
1 state NY
2 city Kansas City

etc...

I used to just be able to write a simple query like this:

SELECT attribute, value, count(*)
    FROM table
    GROUP BY attribute, value;

and the output would yield:

Attribute Value Count
county New York County 11
city Kansas City 22
state CA 15
zip 100010 21
state NY 5

I'm trying to generate the same table above but with the jsonb table but I'm having trouble getting the desired output.

I've tried using jsonb_each_text like so:

with t1 as
    (select jsonb_each_text(facets) as rec from document_template_facets)
select (rec).key, sum((rec).value::int) from t1 group by (rec).key;

The problem is that it doesn't work for array types in my data like city, county, etc... Any way to get the arrays to be flattened in the query above to get the count to work?


Solution

  • The jsonb_each_text() function returns rows, so it has to be in the from part of your query. Do that with a cross join lateral.

    The below query returns what you want. The case within the jsonb_array_elements_text handles scalar values like the zipCode elements in your data by turning those into single-element arrays:

    with expand_keys as (
      select id, k, a
        from tab
             cross join lateral jsonb_each(data) as j(k, a)
    ), expand_arrays as (
      select id, k, a, el
        from expand_keys
             cross join lateral 
               jsonb_array_elements_text(
                 case jsonb_typeof(a)
                   when 'array' then a
                   else jsonb_build_array(a)
                 end
               ) as ar(el)
    )
    -- select * from expand_arrays; --run this, instead, to see interim results
    select k as attribute, el as value, count(*) as cnt
      from expand_arrays
     group by k, el;
    

    Fiddle here