Search code examples
arraysjsonpostgresqlaggregate-functionsjsonb

Postgres sum of array stored in jsonb


I have a postgres database where some data are stored as jsonb arrays:

 id |         start          |   duration   |   value    
----+------------------------+--------------+------------
  1 | 2019-01-04 18:34:00+01 | [60]         | [7]
  2 | 2019-01-04 18:44:00+01 | [60]         | [9]
  3 | 2019-01-04 19:00:00+01 | [60]         | [6]
  4 | 2019-01-04 19:06:00+01 | [60]         | [17]
  5 | 2019-01-04 19:19:00+01 | [60]         | [9]
  6 | 2019-01-04 19:41:00+01 | [60, 60, 60] | [13, 8, 9]
  7 | 2019-01-04 19:46:00+01 | [60]         | [7]
  8 | 2019-01-04 19:49:00+01 | [60]         | [0]

I would like to get the sum of all the values in the array in the 'value'-field.

I can get all the values from the arrays using jsonb_array_elements:

=# select jsonb_array_elements(value),value from step limit 20;
 jsonb_array_elements |   value    
----------------------+------------
 7                    | [7]
 9                    | [9]
 6                    | [6]
 17                   | [17]
 9                    | [9]
 13                   | [13, 8, 9]
 8                    | [13, 8, 9]
 9                    | [13, 8, 9]
 7                    | [7]

and so on. So I thought

select sum(jsonb_array_elements(value)::integer),start from step group by start

would do it, but I am told: ERROR: aggregate function calls cannot contain set-returning function calls HINT: You might be able to move the set-returning function into a LATERAL FROM item.

I have been looking a little bit into LATERAL FROM, but I still don't really get what postgres wants me to do...

Would it be easier to do this if I store the duration and value as arrays rather than json?


Solution

  • Use the function in a lateral join:

    select start, sum(number::int)
    from step s
    cross join jsonb_array_elements_text(value) as number
    group by start
    
             start          | sum 
    ------------------------+-----
     2019-01-04 19:00:00+01 |   6
     2019-01-04 19:46:00+01 |   7
     2019-01-04 18:44:00+01 |   9
     2019-01-04 19:19:00+01 |   9
     2019-01-04 18:34:00+01 |   7
     2019-01-04 19:06:00+01 |  17
     2019-01-04 19:49:00+01 |   0
     2019-01-04 19:41:00+01 |  30
    (8 rows)
    

    This cross join is a lateral join, the function is executed once for each row from step.