sql postgresql aggregate-functions weather

Is there a more concise way to write this PostgreSQL aggregate window function?


I can calculate the rainfall since 9am for a site as a cumulative value from a series of 10-minute records. However, the nested SQL seems verbose, and I'd like to make it more concise for future maintainability. A CTE might make it a little cleaner, but I'm wondering whether there's a way to get this in a single call with SUM and PARTITION BY, without my current CASE approach.

The following Postgres query calculates the rainfall since 9am using 10-minute values. First I add a field that determines whether each record falls before or after 9am. A record at exactly 9:00 should count towards the previous day's "rainfall since 9am" value, while anything after that (e.g. 9:01) is the point at which the running total resets. This example site reports every 10 minutes, but I want the logic to work down to the minute because other sites report at one-minute frequency.

SELECT
    site_name,
    date_time_utc AT TIME ZONE 'Australia/Hobart' AS date_time_local,
    precip_10min,
    precip_since_9am,
    day_9am_val,
    SUM(precip_10min) OVER 
       (PARTITION BY site_name, day_9am_val ORDER BY date_time_utc) AS precip_since_9am_cal
FROM
    (
        SELECT *,
            CASE
                WHEN DATE_PART('hour', date_time_utc AT TIME ZONE 'Australia/Hobart') > 9 THEN (date_time_utc AT TIME ZONE 'Australia/Hobart')::date
                WHEN DATE_PART('hour', date_time_utc AT TIME ZONE 'Australia/Hobart') < 9 THEN ((date_time_utc AT TIME ZONE 'Australia/Hobart')::date - INTERVAL '1 day')::date
                WHEN DATE_PART('minutes', date_time_utc AT TIME ZONE 'Australia/Hobart') > 0 THEN (date_time_utc AT TIME ZONE 'Australia/Hobart')::date
                ELSE ((date_time_utc AT TIME ZONE 'Australia/Hobart')::date - INTERVAL '1 day')::date
            END AS day_9am_val
        FROM
            temp_export
    ) tbl

ORDER BY
    site_name, date_time_utc DESC

The expected value is provided as precip_since_9am, and the calculated value (which should match) is precip_since_9am_cal. The day_9am_val column is not required; it just illustrates how I tackled partitioning the days at a 9am cutoff. I've placed a small subset of the data in SQL fiddle here, and some example rows are shown below:

site_name  date_time_local      precip_10min  precip_since_9am  day_9am_val  precip_since_9am_cal
sitea      2024-01-18 17:00:00  0.2           10.8              2024-01-18   10.8
sitea      2024-01-18 16:50:00  0.4           10.6              2024-01-18   10.6
sitea      2024-01-18 16:40:00  0.2           10.2              2024-01-18   10.2
sitea      2024-01-18 16:30:00  0.2           10                2024-01-18   10.0
sitea      2024-01-18 16:20:00  0.4           9.8               2024-01-18   9.8
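
The boundary rule described above (a reading at exactly 9:00 belongs to the previous day, and 9:01 starts a new running total) can be checked in isolation. The snippet below is a minimal sketch using three made-up local timestamps rather than rows from temp_export; the column name rain_day is illustrative only.

```sql
-- Hypothetical sample values, already in local time, to test the 9am cutoff.
SELECT
    ts,
    CASE
        WHEN ts::time > TIME '09:00' THEN ts::date      -- after 9:00: today's bucket
        ELSE ts::date - 1                               -- 9:00 or earlier: yesterday's bucket
    END AS rain_day
FROM (VALUES
    (TIMESTAMP '2024-01-18 08:50:00'),
    (TIMESTAMP '2024-01-18 09:00:00'),
    (TIMESTAMP '2024-01-18 09:01:00')
) AS t(ts);
-- 08:50 and 09:00 map to 2024-01-17; 09:01 maps to 2024-01-18
```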

Solution

  • I ended up rewriting the query to remove the sub-select, per NickW's suggestion, and put the CASE expression directly in the partitioning fields:

    SELECT
        site_name,
        date_time_utc AT TIME ZONE 'Australia/Hobart'                                 AS date_time_local,
        precip_10min,
        precip_since_9am,
        SUM(precip_10min) OVER (PARTITION BY site_name, (CASE
                                                           WHEN (date_time_utc AT TIME ZONE 'Australia/Hobart')::time > '9:00'
                                                           THEN (date_time_utc AT TIME ZONE 'Australia/Hobart')::date
                                                           ELSE ((date_time_utc AT TIME ZONE 'Australia/Hobart')::date - INTERVAL '1 day')::date
                                                           END) ORDER BY date_time_utc) AS precip_since_9am_calc
    FROM
        temp_export
    
    ORDER BY
        site_name, date_time_utc DESC
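
A possibly shorter variant drops the CASE entirely by shifting the local timestamp back before casting to a date. This is a sketch, not part of the original solution, and it assumes date_time_utc is a timestamptz column (so that AT TIME ZONE yields a local timestamp). Subtracting 9 hours moves the 9am cutoff to midnight, and the extra microsecond pushes a reading at exactly 09:00:00 into the previous day, which matches the `::time > '9:00'` test at PostgreSQL's microsecond timestamp resolution.

```sql
SELECT
    site_name,
    date_time_utc AT TIME ZONE 'Australia/Hobart' AS date_time_local,
    precip_10min,
    precip_since_9am,
    SUM(precip_10min) OVER (
        PARTITION BY
            site_name,
            -- Shift local time so the "rain day" boundary falls just after 09:00:00
            ((date_time_utc AT TIME ZONE 'Australia/Hobart')
                - INTERVAL '9 hours 1 microsecond')::date
        ORDER BY date_time_utc
    ) AS precip_since_9am_calc
FROM
    temp_export
ORDER BY
    site_name, date_time_utc DESC;
```

The partition key here is the date on which each 9am-to-9am period starts, the same value the CASE expression produces, so the running totals should be identical. Whether it reads more clearly than the explicit CASE is a matter of taste.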