Search code examples
sqlsql-servert-sqlrdbms

Sql Server - Fastest way to get number of times value changed for each ID


I'm writing sql queries to display different types of results as per following requirments :

  1. Display record who's value changed from 1 to 0, maximum times in given timestamp
  2. Display record who's value changed from 1 to 0, minimum times in given timestamp
  3. Display top 10 records who's value changed from 1 to 0, maximum times in given timestamp

Sample Data :

+----------+-------------+-------------+
| DeviceId | CaptureTime | SensorValue |
+----------+-------------+-------------+
|  DC001   | 02/01/2017  |      0      |
|  DC001   | 02/02/2017  |      1      |
|  DC001   | 02/03/2017  |      0      |
|  DC001   | 02/04/2017  |      1      |
|  DC001   | 02/05/2017  |      0      |
|  DC001   | 02/07/2017  |      1      |
|  DC001   | 02/08/2017  |      0      |
|  DC001   | 02/10/2017  |      1      |
|  DC001   | 02/01/2017  |      0      |
|  DC001   | 02/01/2017  |      0      |
|  DC002   | 02/02/2017  |      1      |
|  DC002   | 02/02/2017  |      0      |
|  DC002   | 02/02/2017  |      1      |
|  DC002   | 02/02/2017  |      1      |
|  DC002   | 02/02/2017  |      1      |
|  DC002   | 02/03/2017  |      1      |
|  DC002   | 02/03/2017  |      0      |
|  DC002   | 02/03/2017  |      0      |
|  DC002   | 02/03/2017  |      1      |
|  DC002   | 02/03/2017  |      1      |
|  DC003   | 02/03/2017  |      1      |
|  DC003   | 02/03/2017  |      1      |
|  DC003   | 02/03/2017  |      0      |
|  DC003   | 02/03/2017  |      1      |
|  DC003   | 02/03/2017  |      1      |
|  DC003   | 02/04/2017  |      1      |
|  DC003   | 02/05/2017  |      1      |
|  DC003   | 02/06/2017  |      1      |
|  DC003   | 02/07/2017  |      1      |
|  DC003   | 02/08/2017  |      1      |
|  DC004   | 02/09/2017  |      0      |
|  DC004   | 02/10/2017  |      0      |
|  DC004   | 02/11/2017  |      1      |
|  DC004   | 02/12/2017  |      0      |
|  DC004   | 02/12/2017  |      1      |
|  DC004   | 02/12/2017  |      1      |
|  DC004   | 02/12/2017  |      1      |
|  DC004   | 02/12/2017  |      1      |
|  DC004   | 02/12/2017  |      1      |
|  DC004   | 02/12/2017  |      1      |
|  DC005   | 02/12/2017  |      0      |
|  DC005   | 02/12/2017  |      0      |
|  DC005   | 02/12/2017  |      0      |
|  DC005   | 02/12/2017  |      0      |
|  DC005   | 02/14/2017  |      0      |
|  DC005   | 02/14/2017  |      0      |
|  DC005   | 02/14/2017  |      0      |
|  DC005   | 02/14/2017  |      0      |
|  DC005   | 02/14/2017  |      0      |
|  DC005   | 02/14/2017  |      0      |
+----------+-------------+-------------+

I have created bellow common query for all three requirements :

DECLARE @HoursBack  INT
        , @MinMax   VARCHAR(3)
        , @TopRows  INT

SELECT  TOP (@TopRows) COUNT(TD1.DeviceId) PickedNoOfTimes, ItemName -- I have removed table to get ItemName to simplify this query
FROM    tTrayDetails AS TD1
WHERE   TD1.SensorValue = 0
AND     TD1.CaptureTime > DATEADD(HOUR, -@HoursBack, GETDATE())
AND     TD1.SensorValue <> (
                                SELECT  TOP 1 SensorValue
                                FROM    tTrayDetails TD2
                                WHERE   TD2.CaptureTime < TD1.CaptureTime
                                ORDER BY TD2.CaptureTime DESC
                            )
GROUP BY    TD1.DeviceId
ORDER BY    CASE WHEN @MinMax = 'Max' THEN COUNT(TD1.DeviceId) END DESC
            , CASE WHEN @MinMax = 'Min' THEN COUNT(TD1.DeviceId) END ASC

This query works for all three requirements, by just setting different values of @HoursBack, @MinMax and @TopRows variables.

Here is the values to set for my three requirements :

  1. @HoursBack = 24, @MinMax='Max', @TopRows=1
  2. @HoursBack = 24, @MinMax='Min', @TopRows=1
  3. @HoursBack = 24, @MinMax='Max', @TopRows=10

Now Problem is : This query takes around 40 Sec to execute, only for 14K records on testing environment.

On production environment daily 2-4K records will get added, so this query execution time will increase.

How I can change the query to run faster with large amount of data.


Solution

  • This will count only those rows where SensorValue changed from 1 to 0:

    WITH cte AS
     (
       SELECT DeviceId,
          -- previous row = 1 and current row = 0
          CASE WHEN LAG(SensorValue)
                    Over (PARTITION BY DeviceId
                          ORDER BY CaptureTime) = 1
                    AND SensorValue = 0
               THEN 1
               ELSE 0
          END AS ChangeFlag
       FROM tTrayDetails AS t
       WHERE ....
     )
    SELECT DeviceId, Count(*)
    FROM cte
    WHERE ChangeFlag = 1
    GROUP BY DeviceId
    

    Now apply your TOP/ORDER BY ...