Search code examples
sqlmariadbaveragemariadb-10.5

MAriadb optimizing query for averaging


I have database like this (in real there are over 30 different sKey):

+----+------+------+---------------------+
|ID  | sKey | sVal |      timestamp      |
+----+------+------+---------------------+
| 1  | temp |   19 | 2023-07-14 20:32:06 |
| 2  | humi |   60 | 2023-07-14 20:33:06 |
| 3  | temp |   20 | 2023-07-14 20:34:06 |
| 4  | humi |   65 | 2023-07-14 20:35:06 |
| 5  | pres | 1023 | 2023-07-14 20:36:06 |
| 6  | temp |   22 | 2023-07-14 20:37:06 |
| 7  | temp |   21 | 2023-07-14 20:38:06 |
| 8  | pres | 1028 | 2023-07-14 20:39:06 |
| 9  | temp |   20 | 2023-07-14 20:40:06 |
|10  | temp |   19 | 2023-07-14 20:43:06 |  <-time glitch
|11  | pres | 1022 | 2023-07-14 20:44:06 |
|12  | temp |   19 | 2023-07-14 20:45:06 |
|13  | humi |   66 | 2023-07-14 20:46:06 |
|14  | humi |   63 | 2023-07-14 20:47:06 |
|15  | temp |   19 | 2023-07-14 20:48:06 |
|16  | pres | 1029 | 2023-07-14 20:49:06 |
|20  | temp |   19 | 2023-07-14 20:50:06 | <- ID not consecutive (deleted records)
|21  | pres | 1022 | 2023-07-14 20:61:06 |
|22  | temp |   19 | 2023-07-14 20:62:06 |
|23  | pres | 1029 | 2023-07-14 20:63:06 |
+----+------+------+---------------------+

Now I wish to get averages for each sKey (with value in sVal). For now I have working solution but it's to slow. Actually I have 3 separate queries like this:

SELECT AVG(`sVal`), `timestamp` FROM `Test` WHERE sKey='temp'  AND timestamp between '2023-07-14 20:34:06' and '2023-07-14 20:51:06' GROUP BY FLOOR(TO_SECONDS(`timestamp`)/180)
SELECT AVG(`sVal`), `timestamp` FROM `Test` WHERE sKey='humi'  AND timestamp between '2023-07-14 20:34:06' and '2023-07-14 20:51:06' GROUP BY FLOOR(TO_SECONDS(`timestamp`)/180)
SELECT AVG(`sVal`), `timestamp` FROM `Test` WHERE sKey='pres'  AND timestamp between '2023-07-14 20:34:06' and '2023-07-14 20:51:06' GROUP BY FLOOR(TO_SECONDS(`timestamp`)/180)

As database is allready over 2000000 records now, the single query take around 3 seconds. I assume if I can somehow join query to just one as only WHERE sKey=... the result should be faster. So how to improve that.

Or maybe I have wrong approach at all. The wanted result is to get averaged sVal per each sKey. The averaging interval cant be by record numbers (ID) as some record may be deleted. Even if ID is in row there may be some record missed. So I think only interval in timestamp itself can be used. But I'm novice in SQL and I'm possible miss something.

Average time (3 minutes in example) may be any value.


Solution

  • I don't understand why you don't group by skey, what's wrong with

    SELECT AVG(`sVal`), `skey`, `timestamp` FROM `Test`
    WHERE sKey in ('temp','humi','pres')  AND 
       timestamp between '2023-07-14 20:34:06' and '2023-07-14 20:51:06'
    GROUP BY  FLOOR(TO_SECONDS(`timestamp`)/180), `skey`
    

    and mysql is something diffent in grouping and allows your construct, shouldn't it be:

    SELECT AVG(`sVal`), `skey`, FLOOR(TO_SECONDS(`timestamp`)/180) FROM `Test`
    WHERE sKey in ('temp','humi','pres')  AND 
       timestamp between '2023-07-14 20:34:06' and '2023-07-14 20:51:06'
    GROUP BY  FLOOR(TO_SECONDS(`timestamp`)/180), `skey`