I have two tables in my PostgreSQL 9.6 instance.
users
+----+------------+-----------+-------------------+
| id | first_name | last_name | email |
+----+------------+-----------+-------------------+
| 1 | John | Doe | [email protected] |
+----+------------+-----------+-------------------+
| 2 | Jane | Doe | [email protected] |
+----+------------+-----------+-------------------+
| 3 | Mike | Doe | [email protected] |
+----+------------+-----------+-------------------+
surveys
+----+---------+----------------------------------------------------------------------------------------------------+
| id | user_id | survey_data |
+----+---------+----------------------------------------------------------------------------------------------------+
| 1 | 1 | {'child_list': [{'gender': 1, 'birthday': '2015-10-01'}, {'gender': 2, 'birthday': '2017-05-01'}]} |
+----+---------+----------------------------------------------------------------------------------------------------+
| 2 | 2 | {'child_list': []} |
+----+---------+----------------------------------------------------------------------------------------------------+
| 3 | 3 | {'child_list': [{'gender': 2, 'birthday': '2008-01-01'}]} |
+----+---------+----------------------------------------------------------------------------------------------------+
I would like be able to query these two tables to get the number of users who have children between certain age. The survey_data
column in surveys
table is a JSONB column.
So far I've tried using jsonb_populate_recordset
with LATERAL
joins. I was able to SELECT
the child_list
array as two columns but couldn't figure out how to use that with my JOIN
between users
and surveys
tables. The query I used is as below:
SELECT DISTINCT u.email
FROM surveys
CROSS JOIN LATERAL (
SELECT *
FROM jsonb_populate_recordset(null::json_type, (survey.survey_data->>'child_list')::jsonb) AS d
) d
INNER JOIN users u ON u.id = survey.user_id
WHERE d.birthday BETWEEN '2014-05-05' AND '2018-05-05';
This also uses a custom type which was created using this:
CREATE type json_type AS (gender int, birthday date)
My question is, is there an easier to read way to do this? I would like to use this query with many other JOIN
s and WHERE
clauses and I was wondering if there is a better way of doing this.
Note: this is mainly going to be used by a reporting system which does not need to be super fast but of course any speed gains are welcome.
Use the function jsonb_array_elements(),
examples:
select email, (elem->>'gender')::int as gender, (elem->>'birthday')::date as birthday
from users u
left join surveys s on s.user_id = u.id
cross join jsonb_array_elements(survey_data->'child_list') as arr(elem)
email | gender | birthday
-------------------+--------+------------
[email protected] | 1 | 2015-10-01
[email protected] | 2 | 2017-05-01
[email protected] | 2 | 2008-01-01
(3 rows)
or
select distinct email
from users u
left join surveys s on s.user_id = u.id
cross join jsonb_array_elements(survey_data->'child_list') as arr(elem)
where (elem->>'birthday')::date between '2014-05-05' and '2018-05-05';
email
-------------------
[email protected]
(1 row)
You can make your life easier using a view:
create view users_children as
select email, (elem->>'gender')::int as gender, (elem->>'birthday')::date as birthday
from users u
left join surveys s on s.user_id = u.id
cross join jsonb_array_elements(survey_data->'child_list') as arr(elem);
select distinct email
from users_children
where birthday between '2014-05-05' and '2018-05-05';