Search code examples
mysqlsqlindexingquery-optimizationinnodb

Optimize MySQL InnoDB query for max, count


I have an MySQL InnoDB table with 5.7M rows and 1.9GB size:

+-------------------+---------+------+-----+---------+----------------+
|       Field       |  Type   | Null | Key | Default |     Extra      |
+-------------------+---------+------+-----+---------+----------------+
| id                | int(20) | NO   | PRI | NULL    | auto_increment |
| listing_id        | int(20) | YES  |     | NULL    |                |
| listing_link      | text    | YES  |     | NULL    |                |
| transaction_title | text    | YES  |     | NULL    |                |
| image_thumb       | text    | YES  |     | NULL    |                |
| seller_link       | text    | YES  |     | NULL    |                |
| seller_name       | text    | YES  |     | NULL    |                |
| sale_date         | date    | YES  |     | NULL    |                |
+-------------------+---------+------+-----+---------+----------------+

Here are my my.ini settings for my 3GB RAM server:

key_buffer = 16M
max_allowed_packet = 16M
sort_buffer_size = 8M
net_buffer_length = 8K
read_buffer_size = 2M
read_rnd_buffer_size = 16M
myisam_sort_buffer_size = 8M
log_error = "mysql_error.log"
innodb_autoinc_lock_mode=0
join_buffer_size = 8M
thread_cache_size = 8
thread_concurrency = 8
query_cache_size = 64M
query_cache_limit = 2M
ft_min_word_len = 4
thread_stack = 192K
tmp_table_size = 64M

innodb_buffer_pool_size = 2G
innodb_additional_mem_pool_size = 16M
innodb_log_file_size = 512M
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 1
innodb_lock_wait_timeout = 120
innodb_write_io_threads = 8
innodb_read_io_threads = 8
innodb_thread_concurrency = 16
innodb_log_files_in_group = 3
innodb_max_dirty_pages_pct = 90

When I run next query it takes over 20 minutes to return the results:

SELECT transaction_title, 
       listing_id, 
       seller_name, 
       Max(sale_date) AS sale_date, 
       Count(*)       AS count 
FROM   sales_meta 
WHERE `sale_date` BETWEEN '2017-06-06' AND '2017-06-06' 
GROUP  BY listing_id 
HAVING Count(*) > 1 
ORDER  BY count DESC, 
          seller_name;

I've done some research and it appears I need to add some indexes to speed things up, but I am confused how to go about it. There are some single-column indexes and some multi-column indexes, which one should I do?

To make things more complicated, there are few other queries that I will need to perform on this table regularly:

SELECT * 
FROM   sales_meta 
WHERE ` sale_date `= '2017-06-06'; 

and

SELECT DISTINCT `seller_name` 
FROM   `sales_meta`; 

These two are probably less taxing, but I still need to optimize for them as well if possible, although the first query out of three is the top priority for now.


Solution

  • if you want just the value for a single day and the the data type is date then you could avoid the between clause and use =

        SELECT transaction_title, 
               listing_id, 
               seller_name, 
               Max(sale_date) AS max_sale_date, 
               Count(*)       AS count 
        FROM   sales_meta 
        WHERE sale_date =  str_to_date('2017-06-06', '%Y-%m-%d')  
        GROUP  BY listing_id 
        HAVING Count(*) > 1 
        ORDER  BY count DESC, seller_name;
    

    and be sure you have an index on sale_date