Search code examples
sqlsql-servert-sqllag

Return only the modified records in a table, using T-SQL


I'm attempting to return only the modified records in a table, using T-SQL.

This is what I've done thus far:

 BEGIN
    IF OBJECT_ID('tempdb..#Test') IS NOT NULL
        DROP TABLE #Test

    CREATE TABLE #Test
        (SetName nvarchar(100),
        [Timestamp] datetime,
        Value smallint)

    INSERT INTO #Test VALUES('Alpha', GETDATE(), 1)
    INSERT INTO #Test VALUES('Alpha', GETDATE(), 0)
    INSERT INTO #Test VALUES('Alpha', GETDATE(), 1)
    INSERT INTO #Test VALUES('Beta', GETDATE(), 1)
    INSERT INTO #Test VALUES('Beta', GETDATE(), 1)
    INSERT INTO #Test VALUES('Beta', GETDATE(), 1)
    INSERT INTO #Test VALUES('Gamma', GETDATE(), 1)
    INSERT INTO #Test VALUES('Gamma', GETDATE(), 0)
    INSERT INTO #Test VALUES('Gamma', GETDATE(), 1)

    SELECT * FROM #Test

END;

Results:

SetName Timestamp               Value

Alpha   2022-05-23 12:58:41.100 1
Alpha   2022-05-23 12:58:41.101 0
Alpha   2022-05-23 12:58:41.102 1
Beta    2022-05-23 12:58:41.103 1
Beta    2022-05-23 12:58:41.104 1
Beta    2022-05-23 12:58:41.105 1
Gamma   2022-05-23 12:58:41.106 1
Gamma   2022-05-23 12:58:41.107 0
Gamma   2022-05-23 12:58:41.108 1

Expected results:

Alpha   2022-05-23 12:58:41.101 0 -- Changed from 1 to 0
Alpha   2022-05-23 12:58:41.102 1 -- Changed from 0 to 1
Gamma   2022-05-23 12:58:41.107 0 -- Changed from 1 to 0
Gamma   2022-05-23 12:58:41.108 1 -- Changed from 0 to 1

The following statement returns all the 0 to 1, and 1 to 0 records and I don't understand why:

;WITH cte AS
(
SELECT 
    SetName, [Timestamp], Value, lag(Value, 1, -1) OVER (ORDER BY [Timestamp]) AS LastValue
FROM #Test
) 

SELECT 
    SetName, [Timestamp], [Value]
FROM
    cte
WHERE value <> LastValue

Solution

  • To track the changes at whole dataset, you were almost there, you just need to remove the first row by using LastValue <> -1.

    WITH CTE AS
    (
    SELECT 
        SetName, [Timestamp], Value, lag(Value, 1, -1) OVER (ORDER BY [Timestamp]) AS LastValue
    FROM #Test
    ) 
    
    SELECT 
        SetName, [Timestamp], [Value]
    FROM
        CTE
    WHERE value<>Lastvalue
    AND Lastvalue<> -1; --Add this filter to remove the rows which doesn't have any Lag Value
    

    To track the changes at SetName level.

    WITH CTE AS
    (
    SELECT 
       SetName, [Timestamp], Value, Lag(Value, 1, -1) OVER (PARTITION BY SetName ORDER BY [Timestamp]) AS LastValue
    FROM #Test
    ) 
    
    SELECT 
        SetName, [Timestamp], [Value]
    FROM
        CTE
    WHERE Value<>LastValue
    AND LastValue <> -1;
    

    db<>fiddle: Try here

    Note: From the given dataset, modified records at SetName level and at whole dataset is same.

    Assumption: Each Timestamp value is different.