I'm using the following query to populate a table of daily temperature extremes from a large set of readings. There are 33 thermometers, and readings are taken every minute, so there are about 46K readings (i.e. rows) per day, but only 33 rows get added to extremes each day.
Originally I planned to run this query each time a new reading was inserted, to keep today's extremes up to date. However, I soon discovered the query takes a LONG time to run: about 5½ minutes on my MacBook for a full day's readings.
I'd be very interested in some insight into why it's so slow, and in how to make this query faster, or a better alternative. Note that extremes has a composite primary key on (Date, Sensor_ID), since that combination is what makes each row unique.
THANKS !!
insert into extremes(Date, Sensor_ID, `min`, `max`, `avg`)
select date(DateTime) as `Date`, Sensor_ID as Sensor_ID,
min(Value) as `min`, max(Value) as `max`, avg(Value) as `avg`
from readings where date(`DateTime`) = date(NOW())
group by date(DateTime), Sensor_ID
on duplicate key update
`min` = values(`min`), `max` = values(`max`), `avg` = values(`avg`);
As requested, here are the table definitions:
CREATE TABLE `readings` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`Sensor_ID` int(11) NOT NULL,
`DateTime` datetime NOT NULL,
`Value` double NOT NULL,
PRIMARY KEY (`ID`),
UNIQUE KEY `ID_UNIQUE` (`ID`),
KEY `ID_idx` (`Sensor_ID`),
CONSTRAINT `ID` FOREIGN KEY (`Sensor_ID`) REFERENCES `sensors` (`ID`) ON DELETE CASCADE ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=54500039 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
CREATE TABLE `extremes` (
`Date` datetime NOT NULL,
`Sensor_ID` int(11) NOT NULL,
`min` double DEFAULT NULL,
`max` double DEFAULT NULL,
`avg` double DEFAULT NULL,
`updates` int(11) DEFAULT '0',
PRIMARY KEY (`Date`,`Sensor_ID`),
KEY `ID_idx` (`Sensor_ID`),
CONSTRAINT `foo` FOREIGN KEY (`Sensor_ID`) REFERENCES `sensors` (`ID`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
Add an index to the DateTime column in your readings table. Your original query can't use any index on DateTime, because the WHERE clause wraps the column in date(): applying a function to an indexed column forces MySQL to evaluate it for every row, which means a full table scan of all ~54M readings. Rewriting the condition as a plain range comparison on DateTime lets MySQL use the index to read only today's rows.
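A minimal way to create that index (the name idx_datetime is just a suggestion; any name works):

```sql
ALTER TABLE readings ADD INDEX idx_datetime (DateTime);
```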
Then try the below SQL:
insert into extremes(Date, Sensor_ID, `min`, `max`, `avg`)
select date(DateTime) as `Date`, Sensor_ID as Sensor_ID,
min(Value) as `min`, max(Value) as `max`, avg(Value) as `avg`
from readings where `DateTime` >= date_format(curdate(), '%Y-%m-%d 00:00:00')
group by date(DateTime), Sensor_ID
on duplicate key update
`min` = values(`min`), `max` = values(`max`), `avg` = values(`avg`);
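Since extremes already has an `updates` column, another option worth considering (a sketch, untested against your data) is to skip the daily re-aggregation entirely and fold each new reading into extremes as it arrives. Here @dt, @sensor, and @value are placeholders for the new reading's DateTime, Sensor_ID, and Value:

```sql
insert into extremes(`Date`, Sensor_ID, `min`, `max`, `avg`, `updates`)
values (date(@dt), @sensor, @value, @value, @value, 1)
on duplicate key update
  -- Assignments in ON DUPLICATE KEY UPDATE are evaluated left to right,
  -- so `avg` is recomputed while `updates` still holds the old count.
  `avg` = (`avg` * `updates` + values(`avg`)) / (`updates` + 1),
  `min` = least(`min`, values(`min`)),
  `max` = greatest(`max`, values(`max`)),
  `updates` = `updates` + 1;
```

This touches one extremes row per insert instead of re-scanning 46K readings, at the cost of a tiny bit of extra work on every insert.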