Search code examples
mysqlquery-optimizationquery-performance

Joined query index not applied


I have a query:

 SELECT `dsd_prefix`,
      `dsd_partner`,
      `eev1`.`eev_dse_element_name`,
      `devd_explanation`,
      `devd_min`,
      `eev1`.`eev_dev_value`,
      `devd_max`,
      `devd_format`,
      `devd_not_applicable`,
      `devd_not_available`,
      `dsd_nid`
       FROM `devdescription` 
 INNER JOIN ekohubelementvalue AS `eev1`
             ON `eev1`.`eev_dse_element_name` = `devd_element_name`
            AND `eev1`.`eev_prefix` = `devd_prefix`
  LEFT JOIN `ekohubelementvalue` AS `eev2`
               ON `eev1`.`eev_prefix` = `eev2`.`eev_prefix`
              AND `eev1`.`eev_dse_element_name` = `eev2`.`eev_dse_element_name`
              AND `eev1`.`eev_subcategory` = `eev2`.`eev_subcategory`
              AND `eev1`.`eev_company_id` = `eev2`.`eev_company_id`
              AND `eev2`.`eev_date_updated` > `eev1`.`eev_date_updated`
INNER JOIN `datasourcedescription`
               ON `eev1`.`eev_prefix` = `dsd_prefix`
      WHERE (`eev1`.`eev_company_id` = 'ADD4027'
            AND `eev2`.`eev_date_updated` IS NULL
            AND `dsd_type_id` != 'MAJ'
            AND `dsd_hide` = 'No'
            AND (`devd_supress` IS NULL OR `devd_supress` <> 'Yes'))
   GROUP BY `eev1`.`eev_dse_element_name`, `eev1`.`eev_prefix`
   ORDER BY dsd_prefix

EXPLAIN of this query:

+----+-------------+-----------------------+------------+------+-----------------------------------------------------------------------------------------------------------------+---------------------+---------+--------------------------------------------------------------------------------------------------------------------------+------+----------+----------------------------------------------+
| id | select_type | table                 | partitions | type | possible_keys                                                                                                   | key                 | key_len | ref                                                                                                                      | rows | filtered | Extra                                        |
+----+-------------+-----------------------+------------+------+-----------------------------------------------------------------------------------------------------------------+---------------------+---------+--------------------------------------------------------------------------------------------------------------------------+------+----------+----------------------------------------------+
|  1 | SIMPLE      | datasourcedescription | NULL       | ALL  | PRIMARY,datasourcedescription_dsd_type_id                                                                       | NULL                | NULL    | NULL                                                                                                                     |  688 |    10.00 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | eev1                  | NULL       | ref  | eev_prefix,eev_company_id,earliest_and_latest,slice_by_date_for_company,sources_for_special_issue               | earliest_and_latest | 47      | csrhub_data_1.datasourcedescription.dsd_prefix                                                                           |  607 |     0.04 | Using where                                  |
|  1 | SIMPLE      | devdescription        | NULL       | ref  | reports,supress,devd_element_name                                                                               | reports             | 816     | csrhub_data_1.datasourcedescription.dsd_prefix,csrhub_data_1.eev1.eev_dse_element_name                                   |    1 |    50.00 | Using where                                  |
|  1 | SIMPLE      | eev2                  | NULL       | ref  | eev_prefix,eev_company_id,earliest_and_latest,slice_by_date,slice_by_date_for_company,sources_for_special_issue | eev_prefix          | 861     | csrhub_data_1.datasourcedescription.dsd_prefix,csrhub_data_1.eev1.eev_dse_element_name,csrhub_data_1.eev1.eev_company_id |   17 |    19.00 | Using where                                  |
+----+-------------+-----------------------+------------+------+-----------------------------------------------------------------------------------------------------------------+---------------------+---------+--------------------------------------------------------------------------------------------------------------------------+------+----------+----------------------------------------------+

As you can see the datasourcedescription indexes are not being used though they exist in posible_keys. The key column is NULL.

SHOW INDEXES FROM datasourcedescription;

+-----------------------+------------+-----------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table                 | Non_unique | Key_name                          | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+-----------------------+------------+-----------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| datasourcedescription |          0 | PRIMARY                           |            1 | dsd_prefix  | A         |         688 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| datasourcedescription |          1 | datasourcedescription_dsd_type_id |            1 | dsd_type_id | A         |           8 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
+-----------------------+------------+-----------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+

How to make the optimizer utilize datasourcedescription indexes?

As response to @O. Jones: The datasourcedescription columns are dsd_prefix, dsd_type_id and dsd_hide

The table datasourcedescription has 727 rows. The table ekohubelementvalue has nearly 300,000,000 (300M) rows


Solution

  • You metion the ekohubelementvalue has nearly 3M rows. Your where clause was based on a specific company ID. I would rewrite the query slightly, but also, ensure the ekohubelementvalue table has an index with the company id in the primary position and other columns to help cover the join/wehre criteria where possible. Also with MySQL, I would add the "STRAIGHT_JOIN" keyword to tell MySQL to query in the order you provided vs it guessing which order.

    I would have the following indexes available

    ekohubelementvalue index on ( eev_company_id, eev_prefix, eev_dse_element_name, eev_subcategory, eev_date_updated )
    devdescription index on ( devd_element_name, devd_prefix, devd_supress )
    datasourcedescription index on ( dsd_prefix, dsd_type_id, dsd_hide )
    

    Since the order was by the dsd_prefix, but that was joined by the eev_prefix, use the eev_prefix from the primary table which already has optimized index component, let the primary table (not the lookups) be the basis of the group/order.

    I also cleaned-up the query some. Easier to give aliases to long table names so you can use the alias for qualifying each column in the query and respective joins.

    SELECT STRAIGHT_JOIN
            dsd.dsd_prefix,
            dsd.dsd_partner,
            eev1.eev_dse_element_name,
            devd.devd_explanation,
            devd.devd_min,
            eev1.eev_dev_value,
            devd.devd_max,
            devd.devd_format,
            devd.devd_not_applicable,
            devd.devd_not_available,
            dsd.dsd_nid
        FROM 
            ekohubelementvalue AS eev1
                INNER JOIN devdescription devd
                    ON eev1.eev_prefix = devd.devd_prefix
                    AND eev1.eev_dse_element_name = devd.devd_element_name
                            
                LEFT JOIN ekohubelementvalue AS eev2
                    ON eev1.eev_company_id = eev2.eev_company_id
                    AND eev1.eev_prefix = eev2.eev_prefix
                    AND eev1.eev_dse_element_name = eev2.eev_dse_element_name
                    AND eev1.eev_subcategory = eev2.eev_subcategory
                    AND eev1.eev_date_updated < eev2.eev_date_updated
    
                INNER JOIN datasourcedescription dsd
                    ON eev1.eev_prefix = dsd.dsd_prefix
                    AND dsd.dsd_type_id != 'MAJ'
                    AND dsd.dsd_hide = 'No'
    
        WHERE 
                eev1.eev_company_id = 'ADD4027'
            AND ( devd.devd_supress IS NULL 
                OR devd.devd_supress <> 'Yes')
            AND eev2.eev_date_updated IS NULL
        GROUP BY 
            eev1.eev_prefix,
            eev1.eev_dse_element_name
        ORDER BY 
            eev1.eev_prefix