Search code examples
mysqlquery-optimization

How to avoid " Using index; Using temporary; Using filesort " in MySQL, 21 table JOIN


mysql Ver 14.14 Distrib 5.1.58, for redhat-linux-gnu (x86_64) using readline 5.1

I am taking over an old project. I have been asked to speed things up. I have by enabling the slow-query-log. I am examining the queries that appear in the slow-query-log. This query takes 10 seconds to run, and returns 11,000 rows.

  select 
    substring_index(ob1.literal,'/',1) AS sku4,
    substring_index(
    substring_index(ob1.literal,'/',2),'/',-(1)) AS sku5,
    substring_index(
    substring_index(ob1.literal,'/',3),'/',-(1)) AS sku6,
    substring_index(ob2.literal,'/',1) AS sku7,
    substring_index(
    substring_index(ob2.literal,'/',2),'/',-(1)) AS sku8,
    substring_index(
    substring_index(ob2.literal,'/',3),'/',-(1)) AS sku9,concat_ws(',',ob.level_one,
    substring_index(ob1.literal,'/',1),
    substring_index(ob2.literal,'/',1)) AS sku1_filter,concat_ws(',',ob.level_two,
    substring_index(
    substring_index(ob1.literal,'/',2),'/',-(1)),
    substring_index(
    substring_index(ob2.literal,'/',2),'/',-(1))) AS sku2_filter,concat_ws(',',ob.level_three,
    substring_index(
    substring_index(ob1.literal,'/',3),'/',-(1)),
    substring_index(
    substring_index(ob2.literal,'/',3),'/',-(1))) AS sku3_filter,
    t.title_id AS title_id,
    t.us_list_price AS us_list_price,
    t.uk_list_price AS uk_list_price,
    t.can_list_price AS can_list_price,
    t.aus_list_price AS aus_list_price,
    t.min_age AS min_age,
    t.max_age AS max_age,
    t.book_club AS book_club,
    t.best_seller AS best_seller,
    t.award_winner AS award_winner,
    t.asin AS asin,
    t.format AS format,
    ob.literal AS literal_1,
    ob1.literal AS literal_2,
    ob2.literal AS literal_3,
    t.series AS series,
    t.volume AS volume,
    group_concat(distinct concat(u.first_name,  ' ',  u.last_name) separator ',   ') AS marketer,
    a.group_letter AS group_letter,
    group_concat(distinct concat(u2.first_name,  ' ',  u2.last_name) separator ',   ') AS editor,
    oi.imprint_name AS imprint_name,
    ob.level_one AS level_one,
    ob.level_two AS level_two,
    ob.level_three AS level_three,
    ot1.territory_name AS rights,  
  (case when (isnull(t.active) or (t.active = '')) 
          then '' when (t.active = 'Y') 
          then 'Active' when (t.active = 'N') 
          then 'In Development' when (t.active = 'X') 
          then 'Terminated' when (t.active = 'T') 
          then 'Transmittal' end) AS status,
  (case when (isnull(t.format) or (t.format = '')) 
          then '' when (t.format = 4) 
          then 'Ebook' when (t.format = 8) 
          then 'Print Book' when (t.format = 9) 
          then 'Audio Book' end) AS format_name,
  t.title AS title,t.primary_isbn13 AS primary_isbn13,
  group_concat(distinct a.display_name order by a.display_name ASC separator ', ') AS contributors,
  group_concat(distinct c1.display_name order by c1.display_name ASC separator ', ') AS publishers,
  t.publish_date AS pub_date,
  group_concat(distinct g.name order by g.name ASC separator ', ') AS category,
  group_concat(distinct g0.id order by g0.id ASC separator ', ') AS secondary_category_ids,
  group_concat(distinct g0.name order by g0.name ASC separator ', ') AS secondary_categories
  from ((((((((((((((((((((wawa_title t 
        left join wawa_title_to_imprint tti on((t.title_id = tti.title_id))) 
        left join wawa_imprint oi on((tti.imprint_id = oi.imprint_id))) 
        left join wawa_title_to_supplier t2a on((t2a.title_id = t.title_id))) 
        left join wawa_territories ot1 on((t.territory_id = ot1.territory_id))) 
        left join wawa_supplier a on((a.supplier_id = t2a.supplier_id))) 
        left join wawa_sku ob on((ob.code = t.sku1))) 
        left join wawa_sku ob1 on((ob1.code = t.sku2))) 
        left join wawa_sku ob2 on((ob2.code = t.sku3))) 
        left join bll_formats tf on((t.format = tf.id))) 
        left join bll_suppliers_to_wawa_editors aoe on((a.supplier_id = aoe.supplier_id))) 
        left join bll_suppliers_to_wawa_marketing_contacts amc on((a.supplier_id = amc.supplier_id))) 
        left join bll_contacts c0 on((c0.id = amc.id))) 
        left join users u on((amc.user_id = u.id))) 
        left join users u2 on((aoe.user_id = u2.id)))
        join wawa_title_to_publisher t2p on((t2p.title_id = t.title_id)))
        join wawa_publisher p on((p.publisher_id = t2p.publisher_id)))
        join bll_contacts c1 on((c1.id = p.org_contact_id))) 
        left join wawa_title_to_genre t2g on((t2g.title_id = t.title_id))) 
        left join wawa_genres g on((g.id = t2g.genre_id_2))) 
        left join wawa_genres g0 on((g0.id = t2g.genre_id_3))) where ((t.title_id = t.title_id) and (t.active <> 'X'))
  group by t.title_id

The output of EXPLAIN is:

+----+-------------+-------+--------+-----------------------------------------------------------------------------------------------+-------------------------------------+---------+------------------------------------------+------+----------------------------------------------+
  | id | select_type | table | type   | possible_keys                                                                                 | key                                 | key_len | ref                                      | rows | Extra                                        |
  +----+-------------+-------+--------+-----------------------------------------------------------------------------------------------+-------------------------------------+---------+------------------------------------------+------+----------------------------------------------+
  |  1 | SIMPLE      | p     | index  | PRIMARY,org_contact_fk                                                                        | org_contact_fk                      | 5       | NULL                                     |   66 | Using index; Using temporary; Using filesort |
  |  1 | SIMPLE      | c1    | eq_ref | PRIMARY                                                                                       | PRIMARY                             | 4       | wawa_ripple_development.p.org_contact_id |    1 |                                              |
  |  1 | SIMPLE      | t2p   | ref    | idx_title_id,idx_publisher_id                                                                 | idx_publisher_id                    | 4       | wawa_ripple_development.p.publisher_id   |   66 | Using where                                  |
  |  1 | SIMPLE      | t     | eq_ref | PRIMARY,active_index,idx_title_active_isprimary                                               | PRIMARY                             | 4       | wawa_ripple_development.t2p.title_id     |    1 | Using where                                  |
  |  1 | SIMPLE      | t2g   | ref    | idx_title_id                                                                                  | idx_title_id                        | 4       | wawa_ripple_development.t2p.title_id     |    1 |                                              |
  |  1 | SIMPLE      | g     | eq_ref | PRIMARY                                                                                       | PRIMARY                             | 4       | wawa_ripple_development.t2g.genre_id_2   |    1 |                                              |
  |  1 | SIMPLE      | g0    | eq_ref | PRIMARY                                                                                       | PRIMARY                             | 4       | wawa_ripple_development.t2g.genre_id_3   |    1 |                                              |
  |  1 | SIMPLE      | tti   | ref    | PRIMARY                                                                                       | PRIMARY                             | 4       | wawa_ripple_development.t.title_id       |    1 | Using index                                  |
  |  1 | SIMPLE      | oi    | eq_ref | PRIMARY                                                                                       | PRIMARY                             | 4       | wawa_ripple_development.tti.imprint_id   |    1 |                                              |
  |  1 | SIMPLE      | t2a   | ref    | title_id                                                                                      | title_id                            | 4       | wawa_ripple_development.t.title_id       |    1 | Using index                                  |
  |  1 | SIMPLE      | ot1   | eq_ref | PRIMARY                                                                                       | PRIMARY                             | 4       | wawa_ripple_development.t.territory_id   |    1 |                                              |
  |  1 | SIMPLE      | a     | eq_ref | PRIMARY,wawa_supplier_venue_id_supplier_id                                                        | PRIMARY                             | 4       | wawa_ripple_development.t2a.supplier_id    |    1 |                                              |
  |  1 | SIMPLE      | ob    | eq_ref | PRIMARY                                                                                       | PRIMARY                             | 29      | wawa_ripple_development.t.sku1         |    1 |                                              |
  |  1 | SIMPLE      | ob1   | eq_ref | PRIMARY                                                                                       | PRIMARY                             | 29      | wawa_ripple_development.t.sku2         |    1 |                                              |
  |  1 | SIMPLE      | ob2   | eq_ref | PRIMARY                                                                                       | PRIMARY                             | 29      | wawa_ripple_development.t.sku3         |    1 |                                              |
  |  1 | SIMPLE      | tf    | eq_ref | PRIMARY                                                                                       | PRIMARY                             | 4       | wawa_ripple_development.t.format         |    1 | Using index                                  |
  |  1 | SIMPLE      | aoe   | ref    | idx_by_supplier_and_editor,index_bll_suppliers_to_wawa_editors_on_supplier_id                       | idx_by_supplier_and_editor            | 4       | wawa_ripple_development.a.supplier_id      |    1 | Using index                                  |
  |  1 | SIMPLE      | amc   | ref    | idx_by_supplier_and_marketing_contact,index_bll_suppliers_to_wawa_marketing_contacts_on_supplier_id | idx_by_supplier_and_marketing_contact | 4       | wawa_ripple_development.a.supplier_id      |    1 | Using index                                  |
  |  1 | SIMPLE      | c0    | eq_ref | PRIMARY                                                                                       | PRIMARY                             | 4       | wawa_ripple_development.amc.id           |    1 | Using index                                  |
  |  1 | SIMPLE      | u     | eq_ref | PRIMARY                                                                                       | PRIMARY                             | 4       | wawa_ripple_development.amc.user_id      |    1 |                                              |
  |  1 | SIMPLE      | u2    | eq_ref | PRIMARY                                                                                       | PRIMARY                             | 4       | wawa_ripple_development.aoe.user_id      |    1 |                                              |
  +----+-------------+-------+--------+-----------------------------------------------------------------------------------------------+-------------------------------------+---------+------------------------------------------+------+----------------------------------------------+

I am worried about two things:

1.) Using index; Using temporary; Using filesort

2.) Using where

Am I right to be worried about these?

Any suggestions for what I can do to speed things up?

Some variables:

  mysql> show variables like '%buffer%';
  +-------------------------+---------+
  | Variable_name           | Value   |
  +-------------------------+---------+
  | bulk_insert_buffer_size | 8388608 |
  | innodb_buffer_pool_size | 8388608 |
  | innodb_log_buffer_size  | 1048576 |
  | join_buffer_size        | 131072  |
  | key_buffer_size         | 8384512 |
  | myisam_sort_buffer_size | 8388608 |
  | net_buffer_length       | 16384   |
  | preload_buffer_size     | 32768   |
  | read_buffer_size        | 131072  |
  | read_rnd_buffer_size    | 262144  |
  | sort_buffer_size        | 2097144 |
  | sql_buffer_result       | OFF     |
  +-------------------------+---------+

Solution

    • Using index (also type=index) -- This is a 'good' thing. It says that the work for that table can be done entirely in the BTree of the index, without touching the data.
    • Using temporary -- This says that at least one temp table is needed for the execution of the query. There are queries that absolutely must have a tmp table. So live with it. Note: the position of this phrase in the EXPLAIN does not indicate which table really needs the tmp or filesort. See EXPLAIN FORMAT=JSON SELECT ... for such detail.
    • Using filesort -- This says that at least one step required sorting. It does not say whether that sort had to hit the disk. Again, it may be absolutely necessary.
    • Using where -- does not mean much of anything
    • Using index condition -- This is different than "Using index". ICP means that a certain efficiency is being used for a complex WHERE; this is available only in newer versions.
    • eq_ref - One row in the next table is all that is needed. Good.
    • ref - Not 1:1, but 1:many. From the EXPLAIN, it seems to often be closer to 1:1.

    As for speeding it up...

    • Do you need LEFT? If not get rid of it; the optimizer may prefer some other order of tables.
    • The EXPLAIN says that not many rows needed to be fetched. (Caution: Explain's "Row" numbers are approximates.) So, I don't see much help here.
    • See if you can make "covering indexes" for some of the JOINs -- but do this only for the "ref" cases, not the "eq_ref + PRIMARY" cases. It looks like you have already done a thorough job of this.
    • What will you do with 11000 rows? This seems 'unreasonable' for a UI? If it is destined for some further processing, how often is it done? (That is, does 10 seconds really matter?)
    • What Engine are you using? How much RAM do you have? SHOW VARIABLES LIKE '%buffer%'; -- I am fishing for memory utilization.
    • JOIN + GROUP BY -- This often means an explosion of rows, followed by an implosion. Changing some of the JOIN to subqueries in the SELECT may improve things: ( SELECT ... FROM ... WHERE ... LIMIT 1 ) AS whatever.
    • Are any of the fields TEXT? That forces a "filesort" out of memory (hence, slower). Give me more details and perhaps we can work around it.

    In summary, the EXPLAIN looks pretty clean. The huge number of JOINs is another matter.

    Two potentially bad schema designs:

    • Arrays across tables (*_price)
    • Arrays across columns (level*, sku*, user*, genre*)

    Addenda

    For 16GB of RAM and a mixture of InnoDB and MyISAM, suggest key_buffer_size = 1500M and innodb_buffer_pool_size = 5G. Set those in my.cnf (or my.ini) and restart mysqld.