I have a sales table in MySQL (InnoDB). It's +- 1 million records big. I would like to show some nice charts. Fetching the right data is not a problem. Fetching it fast is...
So I like to count the amount of sales in table A grouped per day (later on also month, and year) for PERIOD A till Z. Concrete; for the last 30 days I like to know for each day how many sales records we have in the DB.
So MySQL would have to return something like this:
I like to achieve that MySQL returns the data like this:
date, count
2017-04-01, 2482
2017-04-02, 1934
2017-04-03, 2701
...
The structure of the Sales basically like this:
CREATE TABLE `sales` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`deleted_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `contacts_created_at_index` (`created_at`),
KEY `contacts_deleted_at_index` (`deleted_at`),
KEY `ind_created_at_deleted_at` (`created_at`,`deleted_at`),
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
Some days (datapoints) might not have any results, but I don't like to have gaps in the data. So I also have some 'calendar' table.
CREATE TABLE `time_dimension` (
`id` int(11) NOT NULL,
`db_date` date NOT NULL,
`year` int(11) NOT NULL,
`month` int(11) NOT NULL,
`day` int(11) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `td_ymd_idx` (`year`,`month`,`day`),
UNIQUE KEY `td_dbdate_idx` (`db_date`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
Fetching 30 rows (30 days) with a count per day takes 30 secs...
This is the first query I tried:
SELECT
`db_date` AS `date`,
(SELECT
COUNT(1)
FROM
sales
WHERE
DATE(created_at) = db_date) AS count
FROM
`time_dimension`
WHERE
`db_date` >= '2017-04-11'
AND `db_date` <= '2017-04-25'
ORDER BY `db_date` ASC
But like I said it's really slow (11.9 secs). I tried al kinds of other approaches, but without luck. For example:
SELECT time_dimension.db_date AS DATE,
COUNT(1) AS count
FROM sales RIGHT JOIN time_dimension ON (DATE(sales.created_at) =
time_dimension.db_date)
WHERE
(time_dimension.db_date BETWEEN '2017-03-11' AND '2017-04-11')
GROUP BY
DATE
A query for just 1 datapoint takes only 5.4ms:
SELECT COUNT(1) FROM sales WHERE created_at BETWEEN '2017-04-11 00:00:00' AND '2017-04-25 23:59:59'
I haven't checked innodb_buffer_poolsize on my local machine. I will check that as well. Any ideas on how to make queries like this fast? In the future I would even need to where clauses and joins, to filter the set of sales records..
Thanks.
Nick
You could try to count sale data first, then join count result with your calendar table.
SELECT time_dimension.db_date AS date,
by_date.sale_count
FROM time_dimension
LEFT JOIN (SELECT DATE(sales.created_at) sale_date,
COUNT(1) AS sale_count
FROM sales
WHERE created_at BETWEEN '2017-03-11 00:00:00' AND
'2017-04-11 23:59:59'
GROUP BY DATE(sales.created_at)) by_date
ON time_dimension.db_date = by_date.sale_date
WHERE time_dimension.db_date BETWEEN '2017-03-11' AND '2017-04-11'