I have a jsonb column that looks like this:
Id | Data |
---|---|
1 | {state: ["CA", "NY"], county:["Los Angeles"]} |
2 | {city: ["Kansas City"], zipCode: "12345"} |
3 | {state: ["CO, WA"], zipCode: "5212"} |
But I used to have a data structure like so:
Id | Attribute | Value |
---|---|---|
1 | state | CA |
1 | state | NY |
2 | city | Kansas City |
etc...
I used to just be able to write a simple query like this:
SELECT attribute, value, count(*)
FROM table
GROUP BY attribute, value;
and the output would yield:
Attribute | Value | Count |
---|---|---|
county | New York County | 11 |
city | Kansas City | 22 |
state | CA | 15 |
zip | 100010 | 21 |
state | NY | 5 |
I'm trying to generate the same table above but with the jsonb table but I'm having trouble getting the desired output.
I've tried using jsonb_each_text like so:
with t1 as
(select jsonb_each_text(facets) as rec from document_template_facets)
select (rec).key, sum((rec).value::int) from t1 group by (rec).key;
The problem is that it doesn't work for array types in my data like city, county, etc... Any way to get the arrays to be flattened in the query above to get the count to work?
The jsonb_each_text()
function returns rows, so it has to be in the from
part of your query. Do that with a cross join lateral
.
The below query returns what you want. The case
within the jsonb_array_elements_text
handles scalar values like the zipCode
elements in your data by turning those into single-element arrays:
with expand_keys as (
select id, k, a
from tab
cross join lateral jsonb_each(data) as j(k, a)
), expand_arrays as (
select id, k, a, el
from expand_keys
cross join lateral
jsonb_array_elements_text(
case jsonb_typeof(a)
when 'array' then a
else jsonb_build_array(a)
end
) as ar(el)
)
-- select * from expand_arrays; --run this, instead, to see interim results
select k as attribute, el as value, count(*) as cnt
from expand_arrays
group by k, el;