Search code examples
sqlpostgresqljsonb

Querying JSONB fields with joins


I have a database with two tables. One containing events & the other containing attendees for those events. Theses events have a jsonb field containing an array of objects with a "duration" attribute

events

id name occurrences
1 "The Great Beyond" [{"duration": 100}, {"duration": 200}, {"duration": 300}]
2 "The Final Countdown" [{"duration": 50}]
3 "The Next Plain" null

attendees

id name event_id
1 "Bob" 1
2 "Susan" 2
3 "Terry" 3

I want to run a single query to pull data about these events & showcase the name of the event, how many people are attending & what the current "duration" is for each event (by summing up all the duration values in the occurrences column.

The current query I use is resulting in incorrect COUNT values for my attendees. I suspect it has to do with the way I am constructing my JOINs such that additional rows are being created.

SELECT
  events.id AS "ID",
  events.name AS "Name",
  SUM(coalesce(occurrences_arry->'duration', '0'):int) as "Duration",
  COUNT(attendees.*) as "Attendees"
FROM
  events
    INNER JOIN attendees on attendees.event_id = events.id
    LEFT JOIN jsonb_array_elements(events.occurrences) as occurrences_arry on true
GROUP BY events.id

The result I get back has too high of a count for "Attendees" (The last record should have 1 attendees but says 3). I am pretty sure it has to do with the INNER JOIN LEFT JOIN combo I am using so that I can utilize jsonb_array_elements. Every time a new occurrence is added the attendees count grows 😵‍💫

ID Name Duration Attendees
2 "The Final Countdown" 50 1
3 "The Next Plain" 0 1
1 "The Great Beyond" 600 3

How can I best perform an INNER JOIN on my attendees table and ALSO sum up all the durations in for each event?


Solution

  • You can count the duration in a subquery. You have most of the code that is needed already, you just have to move it, example:

    SELECT
      events.id AS "ID",
      events.name AS "Name",
      (SELECT coalesce(SUM((occurrences_arry->>'duration')::int), 0)
       FROM jsonb_array_elements(events.occurrences) as occurrences_arry) as "Duration",
      COUNT(attendees.*) as "Attendees"
    FROM
      events
        INNER JOIN attendees on attendees.event_id = events.id
    GROUP BY events.id
    

    A SUM returns null, if there are no rows, hence I used coalesce to default to 0 in this case.