Search code examples
mysqlsqlsql-updatemaxaggregate-functions

MySQL query optimization - add a new column with the max value of the group. The value should be repeated for the entire group


I have a mysql table with the following columns- Date, country, YEAR, MONTH, supercategory, class, corp_manufacturer, brand, license,item_description,amount, units, arp.

Total number of records = 7.1 million

Now i want to add a new column which shows the max value of arp for the group (country, YEAR, MONTH, supercategory, class, corp_manufacturer, brand, license,item_description). This max value has to repeated for the all the members in the group.

I have tried the following 2 methods. Both works but both of them takes a long time to update the data more than 2 hrs.

I'm looking for any better solutions that can do this faster. Please help.

Method 1- update the table from subquery

UPDATE table a
INNER JOIN (SELECT Country,YEAR, MONTH, supercategory, class, corp_manufacturer, brand, license, item_description, MAX(arp) AS max_arp FROM table GROUP BY 1,2,3,4,5,6,7,8,9) b  

on a.country = b.country AND a.year = b.year AND a.month=b.month AND a.supercategory=b.supercategory AND a.class=b.class AND a.corp_manufacturer=b.corp_manufacturer AND a.brand=b.brand AND a.license=b.license AND a.item_description = b.item_description

SET a.max_arp= b.max_arp

Method 2 - create temp tables with indexes and join them in to a new table

INSERT INTO  temp1
(SELECT Country,YEAR, MONTH, supercategory, class, corp_manufacturer, brand, license, item_description, MAX(arp) AS max_arp 
    FROM table 
    GROUP BY 1,2,3,4,5,6,7,8,9);

INSERT IGNORE INTO  temp2
SELECT a.*,b.max_arp AS SRP FROM  table  a  JOIN  temp1 b
ON a.country = b.country AND a.year = b.year AND a.month=b.month AND a.supercategory=b.supercategory AND a.class=b.class AND a.corp_manufacturer=b.corp_manufacturer 
    AND a.brand=b.brand AND a.license=b.license AND a.item_description = b.item_description;


Solution

  • We can compute the group count with a window function. That said, why create a column when you can just use a view:

    create myview v as
    select t.*,
        max(arp) over(partition by 
            country, 
            year, 
            month, 
            supercategory, 
            class, 
            corp_manufacturer, 
            brand, 
            license,
            item_description
        ) max_arp
    from mytable t
    

    With this technique, you get an always up-to-date perspective at your data, for 0 maintenance costs. Note that this requires MySQL 8.0.

    If you insist on storing the information, then in MySQL I would recommend the update/join syntax. Assuming that your table has a primary key called id:

    update mytable t
    inner join (
        select id,
            max(arp) over(partition by 
                country, 
                year, 
                month, 
                supercategory, 
                class, 
                corp_manufacturer, 
                brand, 
                license,
                item_description
            ) max_arp
        from mytable t
    ) x on x.id = t.id
    set t.max_arp = x.max_arp