Tags: mysql, sql, greatest-n-per-group

How to get the first and last row of each hour from a table


I am trying to create a graph that shows power consumption for every hour of a day. My device sends data to my server every 15 minutes, and within a sensor_id the power_kwh value never decreases: each reading is greater than or equal to the previous one. In some cases I have two or more power meters for one building, so I need to think about combining values for the same periods.

So I want to retrieve two rows for each hour, holding the first and the last value, with an IN clause containing the sensor_id values. My table (sensor_data in the query below) has the columns date, power_kwh and sensor_id.

After searching the internet for a while, I found a query that comes close, but it has problems with sensor_id:

SELECT
    DATE,
    power_kwh,
    sensor_id
FROM
    sensor_data
WHERE
    (HOUR(DATE), MINUTE(DATE)) IN(
        SELECT
            HOUR(DATE),
            MIN(MINUTE(DATE))
        FROM
            sensor_data
        GROUP BY
            HOUR(DATE)
    )
UNION
SELECT
    DATE,
    power_kwh,
    sensor_id
FROM
    sensor_data
WHERE
    (HOUR(DATE), MINUTE(DATE)) IN(
        SELECT
            HOUR(DATE),
            MAX(MINUTE(DATE))
        FROM
            sensor_data
        GROUP BY
            HOUR(DATE)
    );
My current result
+----------------------------+-----------+-----------+
| date                       | power_kwh | sensor_id |
+----------------------------+-----------+-----------+
| 2020-03-12 15:40:03.000000 | 682685.56 |         4 |
| 2020-03-12 15:59:03.000000 | 682688.44 |         5 |
| 2020-03-12 16:00:03.000000 | 682688.56 |         5 |
| 2020-03-12 16:59:06.000000 | 682697.44 |         5 |
| 2020-03-12 17:00:06.000000 | 682697.56 |         5 |
| 2020-03-12 17:59:08.000000 | 682706.44 |         5 |
| 2020-03-12 18:00:08.000000 | 682706.56 |         5 |
| 2020-03-12 18:59:11.000000 | 682715.44 |         5 |
| 2020-03-12 19:00:11.000000 | 682715.56 |         5 |
| 2020-03-12 19:59:13.000000 | 682724.44 |         5 |
| 2020-03-12 20:00:13.000000 | 682724.56 |         5 |
| 2020-03-12 20:59:16.000000 | 682733.44 |         5 |
+----------------------------+-----------+-----------+



My expected result
+----------------------------+-----------+-----------+
| date                       | power_kwh | sensor_id |
+----------------------------+-----------+-----------+
| 2020-03-12 15:40:03.000000 | 153566.34 |         4 |
| 2020-03-12 15:59:03.000000 | 153575.44 |         4 |
| 2020-03-12 15:00:02.000000 | 682688.56 |         5 |
| 2020-03-12 15:58:06.000000 | 682697.44 |         5 |
| 2020-03-12 16:00:06.000000 | 153576.23 |         4 |
| 2020-03-12 16:59:08.000000 | 153585.44 |         4 |
| 2020-03-12 16:02:08.000000 | 682706.56 |         5 |
| 2020-03-12 16:59:11.000000 | 682715.44 |         5 |
+----------------------------+-----------+-----------+


My MySQL version is:

mysql  Ver 8.0.19 for osx10.14 on x86_64 (Homebrew)

One more thing: I am also thinking about speed and similar concerns, so if there is no SQL solution, can you give me advice on processing this in Java Spring? Let me know if you need more information. Thank you!


Solution

  • If you are running MySQL 8.0, you can use window functions:

    select date, power_kwh, sensor_id
    from (
        select 
            s.*,
            row_number() over(partition by s.sensor_id, date(s.date), hour(s.date) order by s.date) rn_asc,
            row_number() over(partition by s.sensor_id, date(s.date), hour(s.date) order by s.date desc) rn_desc
        from sensor_data s
    ) t
    where rn_asc = 1 or rn_desc = 1
    

    In earlier versions, you can join with an aggregate query:

    select s.date, s.power_kwh, s.sensor_id
    from sensor_data s
    inner join (
        select sensor_id, min(date) min_date, max(date) max_date
        from sensor_data
        group by sensor_id, date(date), hour(date)
    ) g
        on g.sensor_id = s.sensor_id
        and s.date in (g.min_date, g.max_date)
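
The question also asks for an IN filter on sensor_id, and the end goal is an hourly consumption graph. As a minimal sketch of how the first/last idea could be collapsed into one row per sensor and hour, the query below takes the difference between the highest and lowest reading in each hour. It assumes power_kwh is a cumulative counter that never decreases within a sensor, so max minus min equals last minus first. The sensor ids 4 and 5 come from the sample data, and the aliases reading_day, reading_hour and consumed_kwh are illustrative names, not part of the original schema.

```sql
-- Sketch: consumption per sensor and hour, restricted to chosen sensors.
-- Assumption: power_kwh is cumulative and non-decreasing per sensor_id,
-- so (max - min) within an hour equals (last reading - first reading).
select
    sensor_id,
    date(date) as reading_day,
    hour(date) as reading_hour,
    max(power_kwh) - min(power_kwh) as consumed_kwh
from sensor_data
where sensor_id in (4, 5)   -- illustrative ids taken from the sample data
group by sensor_id, date(date), hour(date)
order by reading_day, reading_hour, sensor_id;
```

The same `where sensor_id in (...)` condition can be placed in the inner query of either solution above. Note that, like the first/last-row approach, this only measures consumption between the first and last reading inside each hour, so whatever is consumed between the last reading of one hour and the first reading of the next is not counted.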