Search code examples
mysqlquery-optimization

How to optimize Mysql Query which has HAVING Clause?


While I execute below query, it's scanning all rows from the table b to get the result. How to solve this issue, I already tried indexing the product_id and product_category_variants_values_id from table be. Even tho, I am getting the same effect.

SELECT
    a.product_id, a.product_name, a.product_qty, a.product_stock_status, a.product_type,
    a.product_mode, a.slug, a.product_price, a.product_sales_price 
FROM `product_meta` a 
JOIN `product_variants` b ON a.product_id = b.product_id 
WHERE a.product_category = 1 AND 
    CASE
        WHEN a.product_sales_price>0 THEN a.product_sales_price BETWEEN 0 AND 100000
        ELSE a.product_price BETWEEN 0 AND 100000
    END 
GROUP BY a.product_id 
HAVING SUM(b.product_category_variants_values_id = 24)
AND SUM(b.product_category_variants_values_id = 9);

Query of Table a :

    CREATE TABLE `product_meta` (
  `product_id` int(11) NOT NULL AUTO_INCREMENT,
  `product_name` varchar(500) DEFAULT NULL,
  `product_price` int(50) NOT NULL DEFAULT 0,
  `product_sales_price` int(50) DEFAULT NULL,
  `product_qty` int(10) NOT NULL DEFAULT 0,
  `product_stock_status` int(10) NOT NULL DEFAULT 1,
  `product_category` int(10) NOT NULL DEFAULT 0,
  `product_tag` int(10) NOT NULL DEFAULT 0,
  `product_type` varchar(20) NOT NULL DEFAULT '0',
  `product_mode` varchar(30) NOT NULL DEFAULT '0',
  `slug` varchar(500) NOT NULL,
  PRIMARY KEY (`product_id`),
  KEY `product_category_search` (`product_category`),
  KEY `product_tag_search` (`product_tag`),
  KEY `indx_slug` (`slug`)
)

Query of Table b :

            CREATE TABLE `product_variants` (
  `product_variants_id` int(11) NOT NULL AUTO_INCREMENT,
  `created_date` date NOT NULL DEFAULT current_timestamp(),
  `updated_date` date NOT NULL DEFAULT current_timestamp(),
  `product_id` varchar(50) NOT NULL,
  `product_category_variants_id` varchar(50) NOT NULL,
  `product_category_variants_values_id` varchar(50) NOT NULL,
  PRIMARY KEY (`product_variants_id`),
  KEY `indx_product_id_product_category_variants_values_id` (`product_category_variants_values_id`,`product_id`)
)

Query explain result,

id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE b ALL indx_product_id NULL NULL NULL 26287 Using temporary; Using filesort
1 SIMPLE a eq_ref PRIMARY,product_category_search PRIMARY 4 v3.b.product_id 1 Using where

Kindly tell me what to do to reduce the complete row scanning from table b.


Solution

  • You need a composite index on (b.product_category_variants_values_id, b.product_id):

    ALTER TABLE product_variants
        ADD KEY (product_category_variants_values_id, product_id)
    

    Because the column being indexed is a VARCHAR you should quote the integer values being passed in, to avoid implicit type conversion.

    Move the product variant criteria to the join and change the HAVING clause:

    SELECT
        pm.product_id, pm.product_name, pm.product_qty, pm.product_stock_status, pm.product_type,
        pm.product_mode, pm.slug, pm.product_price, pm.product_sales_price 
    FROM `product_meta` `pm` 
    JOIN `product_variants` `pv`
        ON pm.product_id = pv.product_id
        AND pv.product_category_variants_values_id IN ('24', '9')
    WHERE pm.product_category = 1 AND 
        CASE
            WHEN pm.product_sales_price > 0 THEN pm.product_sales_price BETWEEN 0 AND 100000
            ELSE pm.product_price BETWEEN 0 AND 100000
        END 
    GROUP BY pm.product_id 
    HAVING COUNT(DISTINCT pv.product_category_variants_values_id) = 2;
    

    Moving the aggregation into a subquery might give slightly better performance, depending on the distribution of your data:

    SELECT
        pm.product_id, pm.product_name, pm.product_qty, pm.product_stock_status, pm.product_type,
        pm.product_mode, pm.slug, pm.product_price, pm.product_sales_price
    FROM product_meta pm
    JOIN (
        SELECT product_id
        FROM product_variants
        WHERE product_category_variants_values_id IN ('24', '9')
        GROUP BY product_id
        HAVING COUNT(DISTINCT product_category_variants_values_id) = 2
    ) pv ON pm.product_id = pv.product_id
    WHERE pm.product_category = 1 AND
        CASE
            WHEN pm.product_sales_price>0 THEN pm.product_sales_price BETWEEN 0 AND 100000
            ELSE pm.product_price BETWEEN 0 AND 100000
        END;
    

    Or, you could move the pv criteria to separate EXISTS (probably more efficient):

    SELECT
        pm.product_id, pm.product_name, pm.product_qty, pm.product_stock_status, pm.product_type,
        pm.product_mode, pm.slug, pm.product_price, pm.product_sales_price 
    FROM `product_meta` `pm` 
    WHERE pm.product_category = 1 AND 
        CASE
            WHEN pm.product_sales_price > 0 THEN pm.product_sales_price BETWEEN 0 AND 100000
            ELSE pm.product_price BETWEEN 0 AND 100000
        END
    AND EXISTS (SELECT 1 FROM product_variants WHERE product_id = pm.product_id AND product_category_variants_values_id = '24')
    AND EXISTS (SELECT 1 FROM product_variants WHERE product_id = pm.product_id AND product_category_variants_values_id = '9');
    

    YMMV, give them both a try.