Yesterday an interesting question was asked which required updating a MySQL table using the LAG. Consider the following input table (left), and the desired output (right):
**INPUT** **OUTPUT**
ID TestDate PerformanceStatus (PS) ID TestDate PS PreviousPerformanceStatus
1 15/03/2016 0 1 15/03/2016 0 0
1 01/04/2016 2 1 01/04/2016 2 0
1 05/05/2016 1 1 05/05/2016 1 2
1 07/06/2016 1 1 07/06/2016 1 1
2 15/03/2016 0 2 15/03/2016 0 1
2 01/04/2016 2 2 01/04/2016 2 0
2 05/05/2016 1 2 05/05/2016 1 2
2 07/06/2016 3 2 07/06/2016 3 1
2 23/08/2016 1 2 23/08/2016 1 3
In other words, the goal is to assign to PreviousPerformanceStatus
the value which existed in the record coming before, as ordered by ID
then TestDate
.
The accepted answer, given by @spencer7593, used a correlated subquery. However, what popped into my head first was to use a user variable. Here is how I answered:
SET @lag = 0;
UPDATE yourTable
SET PreviousPerformanceStatus = @lag,
@lag:=PerformanceStatus
ORDER BY ID, TestDate
I was told that this answer is unstable, but I was wondering if someone could explain why something might go wrong, what would be happening in that case, and finally what could we do to use a user variable here to simulate LAG.
It is my understanding that the following SELECT
query would have no issues at all:
SELECT PerformanceStatus,
@lag AS PreviousPerformanceStatus,
@lag:=PerformanceStatus
FROM yourTable
ORDER BY ID, TestDate
However, when doing an UPDATE
there are other considerations to take into account.
I don't think you can set variables in an update statement. Here's my reasoning- Given this
drop table if exists t;
create table t (ID int, TestDate date, PerformanceStatus int, previousperformancestatus int);
insert into t values
(1 , '2016-03-15' , 0, null),
(1 , '2016-04-01' , 2, null),
(1 , '2016-05-05' , 1, null),
(1 , '2016-06-07' , 1, null),
(2 , '2016-03-15' , 0, null),
(2 , '2016-04-01' , 2, null),
(2 , '2016-05-05' , 1, null),
(2 , '2016-06-07' , 3, null),
(2 , '2016-08-23' , 1, null)
;
This code fails
MariaDB [sandbox]> SET @lag = 0;
Query OK, 0 rows affected (0.00 sec)
MariaDB [sandbox]> UPDATE T
-> SET previousPerformanceStatus = @lag ,
-> @lag:=PerformanceStatus
-> ORDER BY ID, TestDate;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '@lag:=PerformanceStatus
ORDER BY ID, TestDate' at line 3
Commenting out @lag:=PerformanceStatus This code runs
MariaDB [sandbox]> SET @lag = 0;
Query OK, 0 rows affected (0.00 sec)
MariaDB [sandbox]> UPDATE T
-> SET previousPerformanceStatus = @lag
-> #,@lag:=PerformanceStatus
-> ORDER BY ID, TestDate;
Query OK, 0 rows affected (0.01 sec)
Rows matched: 9 Changed: 0 Warnings: 0
Since code at least runs without error and the manual https://dev.mysql.com/doc/refman/5.7/en/update.html states "The SET clause indicates which columns to modify " my take on this is that you cannot set variables in an update statement so simulating lag is not possible using this method.