Search code examples
sql-servert-sqlquery-optimization

How to optimize a query?


I have a T-SQL query that run during 6 minutes.

There is several sub-query on the same table in it. I think it is the cause of the problem. But I have no idea for optimizing it.

SELECT  dateheure, bac, presence, reponse
    ,(select top 1 dateheure from LogEvents where bac = t1.bac and dateheure < t1.dateheure order by id desc) as dateheure_precedente
    ,(select top 1 presence from LogEvents where bac = t1.bac and dateheure < t1.dateheure order by id desc) as presence_precedente 
    ,(select top 1 reponse from LogEvents where bac = t1.bac and dateheure < t1.dateheure order by id desc) as reponse_precedente   
    ,(select top 1 dateheure from LogEvents where bac = t1.bac and dateheure > t1.dateheure order by id asc) as dateheure_suivante
    ,(select top 1 presence from LogEvents where bac = t1.bac and dateheure > t1.dateheure order by id asc) as presence_suivante
    ,(select top 1 reponse from LogEvents where bac = t1.bac and dateheure > t1.dateheure order by id asc) as reponse_suivante
FROM [alpla_log].[dbo].[LogEvents] t1 
WHERE 
    t1.presence = 7845
    AND dateheure BETWEEN '11/07/2024 00:00:00' AND '11/07/2024 23:59:59.997' 
ORDER BY id DESC

The only table used is "LogEvents"

CREATE TABLE LogEvents (
    id int IDENTITY NOT NULL PRIMARY KEY,
    dateheure datetime NULL,
    type varchar(50) NULL,
    msg varchar(MAX) NULL,
    presence int NULL,
    destination int NULL,
    status_prt int NULL,
    reponse int NULL,
    bac int NULL
);

You can find the execution plan here.

I tried to add index on the table but it make no difference. CREATE NONCLUSTERED INDEX ON dbo.Logevents (bac, dateheure) INCLUDE (reponse)

There are no other index.

The table is log of events in an automatic warehouse. It save the position of boxes on differents places. I want to find the previous and next position of every box compared with a specific place.

EDIT :

With use of LEAD and LAG function, the query execution time is now around 500ms.


Solution

  • Your query can be significantly improved.

    • While you could replace the six subqueries with two APPLYs, a much better solution is to use LEAD and LAG window functions.
      • Note that LEAD and LAG will not give a result for the last/first row in the partition. You may need to put the whole thing in a derived table with a wider date range, then filter it back down afterwards.
    • The WHERE clause is not "sarge-able" (cannot use indexes). Instead use a date range, preferably a half-open interval >= AND <
    • Remove the ORDER BY if it's not absolutely necessary, as it's a different order from the WHERE and the PARTITION BY of the LEAD.
    SELECT
      dateheure,
      bac,
      presence,
      reponse,
      LAG(t1.dateheure) OVER (PARTITION BY t1.bac ORDER BY t1.dateheure, t1.id) as dateheure_precedente,
      LAG(t1.presence ) OVER (PARTITION BY t1.bac ORDER BY t1.dateheure, t1.id) as presence_precedente,
      LAG(t1.reponse  ) OVER (PARTITION BY t1.bac ORDER BY t1.dateheure, t1.id) as reponse_precedente,
      LEAD(t1.dateheure) OVER (PARTITION BY t1.bac ORDER BY t1.dateheure, t1.id) as dateheure_suivante,
      LEAD(t1.presence ) OVER (PARTITION BY t1.bac ORDER BY t1.dateheure, t1.id) as presence_suivante,
      LEAD(t1.reponse  ) OVER (PARTITION BY t1.bac ORDER BY t1.dateheure, t1.id) as reponse_suivante
    FROM LogEvents t1 
    WHERE 
        t1.presence = 7845
        AND dateheure >= '20240711'
        AND dateheure < '20240712';
    

    Finally add the correct index to support this query. The index you had was not taking presence into account.

    CREATE INDEX IX ON LogEvents (presence, bac, dateheure, id) INCLUDE (reponse);
    

    You can see from this fiddle that this now results in a single scan of the base table, no joins, no key lookups and no sorts.