Search code examples
mysqlsqlmysql-5.7mysql-5.6

Calculating rolling percentage change by category in MySQL 5.6


data table

I have a SQL data table like this, and I wanted to calculate the rolling percentage change (by row and category). So that the result looks like this below

result table

The SQL query I use is really slow and it takes forever to calculate it when there are thousands of categories, do you have an idea what's going on? Or any improvement?

First create a sample data_table:

CREATE TABLE IF NOT EXISTS data_table (
    id INT AUTO_INCREMENT,
    num INT,
    category VARCHAR(10),
    price FLOAT(20,2),
    PRIMARY KEY (id)
);

INSERT INTO data_table(num,category,price)
VALUES(1,"A","10"),
      (2,"A","20"),
      (3,"A","30"),
      (1,"B","20"),
      (2,"B","30"),
      (3,"B","40");

SQL for calculating percentage change:

SELECT 
     A.*, 
     CASE WHEN (A.price IS NULL OR B.price IS NULL OR B.price=0) THEN 0 ELSE
        (A.price - B.price)/(B.price) *100 END AS perc
FROM (SELECT
    num,
    category,
    price
  FROM data_table
  ) A LEFT JOIN (SELECT
    num,
    category,
    price
  FROM data_table
  ) B
ON (A.num = B.num+1) AND A.category=B.category;

Solution

  • You could use user variables -

    SELECT
        dt.*,
        IF(@prev_cat <> category, NULL, ROUND((price - @prev_price) / @prev_price * 100, 1)) AS perc,
        @prev_cat := category,
        @prev_price := price
    FROM data_table dt, (SELECT @prev_cat := 0, @prev_price := 0) vars
    ORDER BY category, num;
    

    If you want to update your table with this perc value you can use -

    ALTER TABLE `data_table` 
      ADD COLUMN `perc` DECIMAL(5,2) NULL AFTER `price`;
    
    UPDATE `test`.`data_table` dt
    JOIN (
        SELECT
            dt.*,
            IF(@prev_cat <> category, NULL, ROUND((price - @prev_price) / @prev_price * 100, 1)) AS perc_calc,
            @prev_cat := category,
            @prev_price := price
        FROM data_table dt, (SELECT @prev_cat := 0, @prev_price := 0) vars
        ORDER BY category, num
    ) z ON dt.id = z.id
    SET dt.perc = z.perc_calc;
    

    If you were on MySQL 8 this would be a bit easier with LAG() -

    SELECT dt.*,
        ROUND((price - LAG(price, 1) OVER (PARTITION BY category ORDER BY num ASC)) / LAG(price, 1) OVER (PARTITION BY category ORDER BY num ASC) * 100, 1) AS `prev1`,
        ROUND((price - LAG(price, 2) OVER (PARTITION BY category ORDER BY num ASC)) / LAG(price, 2) OVER (PARTITION BY category ORDER BY num ASC) * 100, 1) AS `prev2`,
        ROUND((price - LAG(price, 20) OVER (PARTITION BY category ORDER BY num ASC)) / LAG(price, 20) OVER (PARTITION BY category ORDER BY num ASC) * 100, 1) AS `prev20`
    FROM data_table dt;