Search code examples
sql-servert-sqlsql-server-express

TSQL Performance issues using DATEADD in where clause


I have a query using the DATEADD method which takes a lot of time. I'll try to simplify what we do. We are monitoring tempretures and every 5 minutes we store the highest temp and lowest temp in table A

Date          |  Time      | MaxTemp | MinTemp
2011-09-18 | 12:05:00 | 38.15        | 38.099
2011-09-18 | 12:10:00 | 38.20        | 38.10
2011-09-18 | 12:15:00 | 38.22        | 38.17
2011-09-18 | 12:20:00 | 38.21        | 38.20
...
2011-09-19 | 11:50:00 | 38.17        | 38.10
2011-09-19 | 12:55:00 | 38.32        | 38.27
2011-09-19 | 12:00:00 | 38.30        | 38.20

Date/Time columns are of type date/time (and not datetime)

In another table (Table B) we store some data for the entire day, where a day is from NOON (12PM) to noon (not midnight to midnight).

So table B columns include: Date (date only no time)
ShiftManager
MaxTemp (this is the max temp for the entire 24 hours starting at that date noon till next day noon)
MinTemp

I get table B with all the data and just need to update the MaxTemp and MinTemp using table A
For example:For 09/18/2011 I need the maximum temp reading that was between 09/18/2011 12PM and 09/19/2011 12PM.
In the TableA sample we have above, the returend result would be 38.32 as it is the MAX(MaxTemp) for the desired period.

The SQL I'm using:

update TableB
set MaxTemp = (
select MAX(HighTemp) from TableA
where
(Date=TableB.Date and Time > '12:00:00') 
or 
(Date=DATEADD(dd,1,TableB.Date) and Time <= '12:00:00')
)

And it takes a lot of time (if I remove the DATEADD method it is quick).

Here is a simplified sample that shows the data I have and the expected result:

DECLARE @TableA TABLE ([Date] DATE, [Time] TIME(0), HighTemp DECIMAL(6,2));
DECLARE @TableB TABLE ([Date] DATE, MaxTemp DECIMAL(6,2));

INSERT @TableA VALUES
('2011-09-18','12:05:00',38.15),
('2011-09-18','12:10:00',38.20),
('2011-09-18','12:15:00',38.22),
('2011-09-19','11:50:00',38.17),
('2011-09-19','11:55:00',38.32),
('2011-09-19','12:00:00',38.31),
('2011-09-19','12:05:00',38.33),
('2011-09-19','12:10:00',38.40),
('2011-09-19','12:15:00',38.12),
('2011-09-20','11:50:00',38.27),
('2011-09-20','11:55:00',38.42),
('2011-09-20','12:00:00',38.16);

INSERT @TableB VALUES
('2011-09-18', 0),
('2011-09-19', 0);

-- This is how I get the data, now I just need to update the max temp for each day

with TableB(d, maxt) as
(
select * from @TableB
)
update TableB
set maxt = ( 
select MAX(HighTemp) from @TableA 
where
(Date=TableB.d and Time > '12:00:00')
or
(Date=DATEADD(dd,1,TableB.d) and Time <= '12:00:00')
)

select * from @TableB

Hope I was able to explian myself, any ideas how can I do it differently? Thx!


Solution

  • This would probably be a lot easier if you used a single SMALLDATETIME column instead of separating this data into DATE/TIME columns. Also I'm assuming you are using SQL Server 2008 and not a previous version where you're storing DATE/TIME data as strings. Please specify the version of SQL Server and the actual data types being used.

    DECLARE @d TABLE ([Date] DATE, [Time] TIME(0), MaxTemp DECIMAL(6,3), MinTemp DECIMAL(6,3));
    
    INSERT @d VALUES
    ('2011-09-18','12:05:00',38.15,38.099),
    ('2011-09-18','12:10:00',38.20,38.10),
    ('2011-09-18','12:15:00',38.22,38.17),
    ('2011-09-18','12:20:00',38.21,38.20),
    ('2011-09-19','11:50:00',38.17,38.10),
    ('2011-09-19','12:55:00',38.32,38.27),
    ('2011-09-19','12:00:00',38.30,38.20);
    
    SELECT '-- before update';
    SELECT * FROM @d;
    
    ;WITH d(d,t,dtr,maxt) AS
    (
        SELECT [Date], [Time], DATEADD(HOUR, -12, CONVERT(SMALLDATETIME, CONVERT(CHAR(8), 
            [Date], 112) + ' ' + CONVERT(CHAR(8), [Time], 108))), MaxTemp FROM @d 
    ),
    d2(dtr, maxt) AS 
    (
        SELECT CONVERT([Date], dtr), MAX(maxt) FROM d
        GROUP BY CONVERT([Date], dtr)
    )
    UPDATE d SET maxt = d2.maxt FROM d
        INNER JOIN d2 ON d.dtr >= d2.dtr AND d.dtr < DATEADD(DAY, 1, d2.dtr);
    
    SELECT '-- after update';
    SELECT * FROM @d;
    

    Results:

    -- before update
    
    2011-09-18  12:05:00    38.150  38.099
    2011-09-18  12:10:00    38.200  38.100
    2011-09-18  12:15:00    38.220  38.170
    2011-09-18  12:20:00    38.210  38.200
    2011-09-19  11:50:00    38.170  38.100
    2011-09-19  12:55:00    38.320  38.270
    2011-09-19  12:00:00    38.300  38.200
    
    -- after update
    
    2011-09-18  12:05:00    38.220  38.099
    2011-09-18  12:10:00    38.220  38.100
    2011-09-18  12:15:00    38.220  38.170
    2011-09-18  12:20:00    38.220  38.200
    2011-09-19  11:50:00    38.220  38.100
    2011-09-19  12:55:00    38.320  38.270
    2011-09-19  12:00:00    38.320  38.200
    

    Presumably you want to update the MinTemp as well, and that would just be:

    ;WITH d(d,t,dtr,maxt,mint) AS
    (
        SELECT [Date], [Time], DATEADD(HOUR, -12,
             CONVERT(SMALLDATETIME, CONVERT(CHAR(8), [Date], 112) 
             + ' ' + CONVERT(CHAR(8), [Time], 108))), MaxTemp, MaxTemp
        FROM @d 
    ),
    d2(dtr, maxt, mint) AS 
    (
        SELECT CONVERT([Date], dtr), MAX(maxt), MIN(mint) FROM d
        GROUP BY CONVERT([Date], dtr)
    )
    UPDATE d
        SET maxt = d2.maxt, mint = d2.maxt
        FROM d
        INNER JOIN d2
            ON d.dtr >= d2.dtr
            AND d.dtr < DATEADD(DAY, 1, d2.dtr);
    

    Now, this is not really better than your existing query, because it's still going to be using scans to figure out aggregates and all the rows that need to be updating. I'm not saying you should be updating the table at all, because this information can always be derived at query time, but if it is something you really want to do, I would combine the advice in these answers and consider revising the schema. For example, if the schema were:

    USE [tempdb];
    GO
    
    CREATE TABLE dbo.d
    (
        [Date] SMALLDATETIME, 
        MaxTemp DECIMAL(6,3), 
        MinTemp DECIMAL(6,3),
        RoundedDate AS (CONVERT(DATE, DATEADD(HOUR, -12, [Date]))) PERSISTED
    );
    
    CREATE INDEX rd ON dbo.d(RoundedDate);
    
    INSERT dbo.d([Date],MaxTemp,MinTemp) VALUES
    ('2011-09-18 12:05:00',38.15,38.099),
    ('2011-09-18 12:10:00',38.20,38.10),
    ('2011-09-18 12:15:00',38.22,38.17),
    ('2011-09-18 12:20:00',38.21,38.20),
    ('2011-09-19 11:50:00',38.17,38.10),
    ('2011-09-19 12:55:00',38.32,38.27),
    ('2011-09-19 12:00:00',38.30,38.20);
    

    Then your update is this simple, and the plan is much nicer:

    ;WITH g(RoundedDate,MaxTemp)
    AS
    (
        SELECT RoundedDate, MAX(MaxTemp)
            FROM dbo.d
            GROUP BY RoundedDate
    )
    UPDATE d
        SET MaxTemp = g.MaxTemp
        FROM dbo.d AS d
        INNER JOIN g
        ON d.RoundedDate = g.RoundedDate;
    

    Finally, one of the reasons your existing query is probably taking so long is that you are updating all of time, every time. Is data from last week changing? Probably not. So why not limit the WHERE clause to recent data only? I see no need to go recalculate anything earlier than yesterday unless you are constantly receiving revised estimates of how warm it was last Tuesday at noon. So why are there no WHERE clauses on your current query, to limit the date range where it is attempting to do this work? Do you really want to update the WHOLE able, EVERY time? This is probably something you should only be doing once a day, sometime in the afternoon, to update yesterday. So whether it takes 2 seconds or 2.5 seconds shouldn't really matter.