I have a SQL data table like this, and I wanted to calculate the rolling percentage change (by row and category). So that the result looks like this below
The SQL query I use is really slow and it takes forever to calculate it when there are thousands of categories, do you have an idea what's going on? Or any improvement?
First create a sample data_table:
CREATE TABLE IF NOT EXISTS data_table (
id INT AUTO_INCREMENT,
num INT,
category VARCHAR(10),
price FLOAT(20,2),
PRIMARY KEY (id)
);
INSERT INTO data_table(num,category,price)
VALUES(1,"A","10"),
(2,"A","20"),
(3,"A","30"),
(1,"B","20"),
(2,"B","30"),
(3,"B","40");
SQL for calculating percentage change:
SELECT
A.*,
CASE WHEN (A.price IS NULL OR B.price IS NULL OR B.price=0) THEN 0 ELSE
(A.price - B.price)/(B.price) *100 END AS perc
FROM (SELECT
num,
category,
price
FROM data_table
) A LEFT JOIN (SELECT
num,
category,
price
FROM data_table
) B
ON (A.num = B.num+1) AND A.category=B.category;
You could use user variables -
SELECT
dt.*,
IF(@prev_cat <> category, NULL, ROUND((price - @prev_price) / @prev_price * 100, 1)) AS perc,
@prev_cat := category,
@prev_price := price
FROM data_table dt, (SELECT @prev_cat := 0, @prev_price := 0) vars
ORDER BY category, num;
If you want to update your table with this perc value you can use -
ALTER TABLE `data_table`
ADD COLUMN `perc` DECIMAL(5,2) NULL AFTER `price`;
UPDATE `test`.`data_table` dt
JOIN (
SELECT
dt.*,
IF(@prev_cat <> category, NULL, ROUND((price - @prev_price) / @prev_price * 100, 1)) AS perc_calc,
@prev_cat := category,
@prev_price := price
FROM data_table dt, (SELECT @prev_cat := 0, @prev_price := 0) vars
ORDER BY category, num
) z ON dt.id = z.id
SET dt.perc = z.perc_calc;
If you were on MySQL 8 this would be a bit easier with LAG() -
SELECT dt.*,
ROUND((price - LAG(price, 1) OVER (PARTITION BY category ORDER BY num ASC)) / LAG(price, 1) OVER (PARTITION BY category ORDER BY num ASC) * 100, 1) AS `prev1`,
ROUND((price - LAG(price, 2) OVER (PARTITION BY category ORDER BY num ASC)) / LAG(price, 2) OVER (PARTITION BY category ORDER BY num ASC) * 100, 1) AS `prev2`,
ROUND((price - LAG(price, 20) OVER (PARTITION BY category ORDER BY num ASC)) / LAG(price, 20) OVER (PARTITION BY category ORDER BY num ASC) * 100, 1) AS `prev20`
FROM data_table dt;