Search code examples
sqlpostgresql

Combine two tables doing row count based on hour value in timestamp


I've got two tables:

utilities:

id timestamp action
901 2024-08-11 09:59:25.000 on power
902 2024-08-11 09:59:35.000 on water
903 2024-08-11 09:59:55.000 off power
904 2024-08-11 10:01:25.000 on gas
905 2024-08-11 10:02:35.000 off water
906 2024-08-11 10:11:18.000 off power
907 2024-08-11 10:31:28.000 off gas
908 2024-08-11 11:15:37.000 on power

items:

id timestamp action
906 2024-08-11 09:59:45.000 on lights
907 2024-08-11 09:59:58.000 off lights
908 2024-08-11 10:15:34.000 on tap
909 2024-08-11 10:18:25.000 on heating
910 2024-08-11 10:21:44.000 off heating
911 2024-08-11 11:02:35.000 off tap
912 2024-08-11 12:01:08.000 open door
913 2024-08-11 12:11:28.000 closer door

I'm trying to combine them, without the ids clashing (generate a new id?) and then want to do a date_trunc('hour', timestamp) as time, COUNT(*) as metric to work out how many actions are happening in the table per hour, so the end result would be:

time metric
2024-08-11 09:00:00.000 5
2024-08-11 10:00:00.000 7
2024-08-11 11:00:00.000 2
2024-08-11 12:00:00.000 2

I tried this query, but it complains about:

Utilities.timestamp must appear in the GROUP BY ...

WITH one AS (
  SELECT 
  date_trunc('hour', timestamp) as timeOne, 
  COUNT(*) as utilities_count
  FROM Utilities
  ORDER BY timeOne
),
two AS (
  SELECT 
  date_trunc('hour', timestamp) as timeTwo, 
  COUNT(*) as item_count
  FROM Items
  ORDER BY timeTwo
)
SELECT 
  SUM(utilities_count, item_count) as metric,
  timeOne as time
FROM one, two
ORDER BY 1;

Any idea how to get the data combined and counted with the correct hourly binning?


Solution

  • Simply:

    SELECT date_trunc('hour', timestamp) AS time, count(*) AS metric
    FROM  (
       SELECT timestamp FROM utilities
       UNION ALL
       SELECT timestamp FROM items
       ) sub
    GROUP  BY 1
    ORDER  BY 1;
    

    Clashing IDs are not relevant to this query. Simply append rows from both tables with UNION ALL (not just UNION!), use date_trunc() and aggregate.

    I removed a more complex query that would only make sense with (unlikely) expression indexes on underlying tables.