Search code examples
mysqlquery-optimization

Mysql "Using temporary" but no official reason matched in this query


If you look at the official documentation for MySql temporary tables:

http://dev.mysql.com/doc/refman/5.1/en/internal-temporary-tables.html

The reasons given are:

   The server creates temporary tables under conditions such as these:

      Evaluation of UNION statements.

      Evaluation of some views, such those that use the TEMPTABLE algorithm,
      UNION, or aggregation.

      Evaluation of statements that contain an ORDER BY clause and a
      different GROUP BY clause, or for which the ORDER BY or GROUP BY
      contains columns from tables other than the first table in the join queue.

      Evaluation of DISTINCT combined with ORDER BY may require a temporary table.

      For queries that use the SQL_SMALL_RESULT option, MySQL uses an
      in-memory temporary table, unless the query also contains elements
      (described later) that require on-disk storage.

      Evaluation of multiple-table UPDATE statements.

      Evaluation of GROUP_CONCAT() or COUNT(DISTINCT) expressions. 

None of these conditions are met in this query:

  select ttl.id AS id,
  ttl.name AS name,
  ttl.updated_at AS last_update_on,
  ttl.user_id AS list_creator,
  ttl.retailer_nomination_list AS nomination_list,
  ttl.created_at AS created_on,
  tv.name AS venue_name,
  from haha_title_lists ttl 
    left join haha_title_list_to_users tltu on  ((ttl.id = tltu.title_list_id))
    left join users u on  ((tltu.user_id = u.id)) 
    left join users u2 on  ((tltu.user_id = u2.id)) 
    left join haha_title_list_to_venues tlv on  ((ttl.id = tlv.title_list)) 
    left join haha_venue_properties tvp on  ((tlv.venue_id = tvp.id)) 
    left join haha_venues tv on  ((tvp.venue_id = tv.id))
    join haha_title_list_to_books tlb on  ((ttl.id = tlb.title_list_id))
    join wawa_title ot on  ((tlb.title_id = ot.title_id))
    join wawa_title_to_author ota on  ((ot.title_id = ota.title_id))
    join wawa_author oa on  ((ota.author_id = oa.author_id))
    group by ttl.id;

For this table:

  CREATE TABLE  haha_title_lists  (
     id  int(11) unsigned NOT NULL AUTO_INCREMENT,
     name  varchar(255) DEFAULT NULL,
     isbn  varchar(15) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
     created_at  datetime NOT NULL,
     updated_at  datetime NOT NULL,
     user_id  int(11) DEFAULT NULL,
     list_note  text,
     retailer_nomination_list  int(11) DEFAULT NULL,
    PRIMARY KEY ( id )
  ) ENGINE=InnoDB AUTO_INCREMENT=460 DEFAULT CHARSET=utf8

I would expect the PRIMARY KEY to be used, since this table only matches on id. What would cause the use of a temporary table?

If I run EXPLAIN on this query I get:

    +----+-------------+-------+--------+------------------------------------------------------------------------+---------------------------------------+---------+---------------------------------------+------+---------------------------------+
  | id | select_type | table | type   | possible_keys                                                          | key                                   | key_len | ref                                   | rows | Extra                           |
  +----+-------------+-------+--------+------------------------------------------------------------------------+---------------------------------------+---------+---------------------------------------+------+---------------------------------+
  |  1 | SIMPLE      | ttl   | ALL    | PRIMARY                                                                | NULL                                  | NULL    | NULL                                  |  307 | Using temporary; Using filesort |
  |  1 | SIMPLE      | tltu  | ref    | idx_title_list_to_user                                                 | idx_title_list_to_user                | 4       | wawa_ripple_development.ttl.id        |    1 | Using index                     |
  |  1 | SIMPLE      | u     | eq_ref | PRIMARY                                                                | PRIMARY                               | 4       | wawa_ripple_development.tltu.user_id  |    1 | Using index                     |
  |  1 | SIMPLE      | u2    | eq_ref | PRIMARY                                                                | PRIMARY                               | 4       | wawa_ripple_development.tltu.user_id  |    1 | Using index                     |
  |  1 | SIMPLE      | tlb   | ref    | idx_title_list_to_books_title_id,idx_title_list_to_books_title_list_id | idx_title_list_to_books_title_list_id | 4       | wawa_ripple_development.ttl.id        |   49 | Using where                     |
  |  1 | SIMPLE      | ot    | eq_ref | PRIMARY                                                                | PRIMARY                               | 4       | wawa_ripple_development.tlb.title_id  |    1 | Using index                     |
  |  1 | SIMPLE      | ota   | ref    | PRIMARY,title_id                                                       | title_id                              | 4       | wawa_ripple_development.ot.title_id   |    1 | Using where; Using index        |
  |  1 | SIMPLE      | oa    | eq_ref | PRIMARY                                                                | PRIMARY                               | 4       | wawa_ripple_development.ota.author_id |    1 | Using index                     |
  |  1 | SIMPLE      | tlv   | ALL    | NULL                                                                   | NULL                                  | NULL    | NULL                                  |  175 |                                 |
  |  1 | SIMPLE      | tvp   | eq_ref | PRIMARY                                                                | PRIMARY                               | 4       | wawa_ripple_development.tlv.venue_id  |    1 |                                 |
  |  1 | SIMPLE      | tv    | eq_ref | PRIMARY                                                                | PRIMARY                               | 4       | wawa_ripple_development.tvp.venue_id  |    1 |                                 |
  +----+-------------+-------+--------+------------------------------------------------------------------------+---------------------------------------+---------+---------------------------------------+------+---------------------------------+

Why do I get "Using temporary; Using filesort"?


Solution

  • First, some comments...

    The "using temp, using filesort" is often on the first line of the EXPLAIN, but the actual position of them could be anywhere. Furthermore there could be multiple tmps and/or sorts, even for a 1-table query. For example: ... GROUP BY aaa ORDER BY bbb may use one tmp for grouping and another for sorting.

    In newer versions, you can do EXPLAIN FORMAT=JSON SELECT... to get a blow-by-blow account -- it will be clear there how many tmps and sorts there are.

    "Filesort" is a misnomer. In many cases, the data may actually be collected in memory and sorted there. That is, "no file is harmed in the filming of the query". There are many reasons for deciding (either up-front, or later) to use a disk-based sort; I won't give those details in this answer. One way to check is SHOW STATUS LIKE 'Created_tmp%tables';. Another is via the slowlog.

    Only recently have some UNIONs been improved to avoid tmp tables -- in obvious cases where they aren't needed. Alas, unions are still single-threaded.

    Back to your question... Yes, your GROUP BY applies to the first table. But, for whatever reason, the optimizer chose to gather the data, then sort. The other option would have been to use the PRIMARY KEY(id) for ordering and grouping. Hmmm... I wonder what would happen if you added ORDER BY ttl.id? I'm guessing that the Optimizer is focusing on how to do the GROUP BY -- either by filesort or by collecting a hash in ram, and it decided that all the JOINs were too much to think through.