I have the following tables inside of a Postgresql database:
The shifts table contains "Morning Shift" and "Evening Shift" as possible types. Each shift can have multiple users and users can have multiple shifts.
I need a list of all shifts, grouped by their dates, and for every shift containing all users attending the shift. So something like this:
{
...
'2023-01-01': {
'morning_shift': [
{ 'username': 'Some user', 'role': 'Some role', 'tag': 'Some tag' },
...
],
'evening_shift': [
{ 'username': 'Some user', 'role': 'Some role', 'tag': 'Some tag' },
...
]
}
...
}
So far I have come up with the following query:
SELECT shifts.date,
array(
SELECT users.username
FROM shifts_users
JOIN shifts on shifts_users.shift_id = shifts.id
JOIN users on users.id = shifts_users.user_id
WHERE shifts.type = 'Morning Shift') as morning_shift,
array(
SELECT users.username
FROM shifts_users
JOIN shifts on shifts_users.shift_id = shifts.id
JOIN users on users.id = shifts_users.user_id
WHERE shifts.type = 'Evening Shift') as evening_shift
FROM shifts
GROUP BY shifts.date
ORDER BY shifts.date ASC
With that I receive the following result:
{
...
'2023-01-01': {
'morning_shift': [
'Some user',
...
],
'evening_shift': [
'Some user',
...
]
}
...
}
Now I ran into the problem, that I only can get one column with subqueries, which means I currently only get the usernames but not the other values I need.
Is there maybe a different approach I should use or a different table setup to make life easier?
Your query is more broken than you realize. Both subqueries are uncorrelated to the outer SELECT
and gather all users for all days for the given shift, not just the ones for the outer day. Using the same table name shifts
without alias in the subquery hides the outer table of the same name.
After untangling that mess, you'll find that you have to aggregate in two steps. Your original query could work like this:
SELECT date
, min(shift_users) FILTER (WHERE type = 'Morning Shift') AS morning_shift
, min(shift_users) FILTER (WHERE type = 'Evening Shift') AS evening_shift
FROM (
SELECT s.date, s.type
, ARRAY (
SELECT u.username
FROM shifts_users su
JOIN users u ON u.id = su.user_id
WHERE su.shift_id = s.id
) AS shift_users
FROM shifts s
) sub
GROUP BY date
ORDER BY date;
In the outer SELECT
I apply a "poor-man's crosstab" technique. See:
If you want whole rows from table users
instead of just the username
, you could use ARRAY (SELECT u FROM ...
. But an array of records is hard to read / process.
You may want a JSON representation instead. But there is no min(json)
aggregate function (nor min(jsonb)
or similar). There are various ways around this. I would rewrite the query as:
WITH su AS (
SELECT su.shift_id, array_agg(u.username) AS shift_users
FROM shifts_users su
JOIN users u ON u.id = su.user_id
GROUP BY su.shift_id
)
SELECT d.date, m.shift_users AS morning_shift, e.shift_users AS evening_shift
FROM (
SELECT date
, min(id) FILTER (WHERE type = 'Morning Shift') AS m_shift_id
, min(id) FILTER (WHERE type = 'Evening Shift') AS e_shift_id
FROM shifts s
GROUP BY date
) d
LEFT JOIN su m ON m.shift_id = d.m_shift_id
LEFT JOIN su e ON e.shift_id = d.e_shift_id
ORDER BY d.date;
Aggregate days (d
) and shift-users (su
) separately - the latter in a CTE, and then join to it twice. Should also be substantially faster while aggregating the whole table.
Now, to get whole user rows:
WITH su AS (
SELECT su.shift_id, array_agg(u) AS shift_users -- Postgres array of records
FROM ... -- rest is identical
u
being the reference to the whale table row (table alias in this case).
And since we got rid of aggregation in the outer SELECT
, we can switch to JSON at will:
WITH su AS (
SELECT su.shift_id, json_agg(u) AS shift_users -- JSON array of objects
FROM ... -- rest is identical