Search code examples
sqlitegroup-bywhere-clausesql-deletesql-in

delete all rows keep only one per hour group by strftime


I have a Home Assistant sqlite file.

My problem is that it grows fast and reaches enormous dimensions. So I made a php script that cleans it every 24 hours. I delete old and irrelevant data and leave only the data I need.

In addition I wanted to optimize the data I leave and delete for each sensor all the data except one value per hour, it will give me the information I need with all the graphs and reduce the file size.

I was thinking of using the not in option and wrote the following command:

 DELETE from states where state_id not in (select state_id, strftime ('% Y-% m-% d% H', last_changed, 'localtime') as date_hour from states group by entity_id, date_hour);

I sort all my sensors by sensor ID and by time and date and delete everything except one per hour.

But when I ran the command I got: Error: only a single result allowed for a SELECT that is part of an expression

Since I can not remove the date_hour column how do you guys think I should do it? Thank you!


Solution

  • The subquery returns 2 columns so you must compare 2 columns against its results:

    DELETE FROM states 
    WHERE (entity_id, last_changed) NOT IN (
      SELECT entity_id, MAX(last_changed)
      FROM states 
      GROUP BY entity_id, strftime('%Y-%m-%d% %H', last_changed) 
    ); 
    

    See the demo.