Search code examples
sqlsql-serverquery-optimizationcross-applyouter-apply

Optimize query with multiple OUTER APPLY


I'm having query with multiple OUTER APPLY but all tables have primary keys on joined columns (so clustered indexes are used here) so I don't know how to optimize this query futher. Also it's impossible to use indexed view here because using ORDER BY and TOP is forbidden for them.

So I'm having

  • Table Fields with Id primary key and big variety of other columns.

  • WeatherHistory table with complex primary key (FieldId and [Date]) and a lot of columns,

  • NdviImageHistory table with FieldId, [Date], [Base64] columns (complex primary key FieldId and [Date]) where [Base64] stores image base64,

  • NaturalColorImageHistory table with FieldId, [Date], [Base64] columns (complex primary key FieldId and [Date]) where [Base64] stores image base64,

  • NdviHistory table with with FieldId, [Date], MeanNdvi columns (complex primary key FieldId and [Date]),

  • FieldSeasonHistory table with Field, StartDate, EndDate columns (complex primary key FieldId and [Date]).

My query

SELECT Fields.*,
    WeatherHistory.TempSumC AS CurrentTempSumC,
    TempSumF AS CurrentTempSumF,
    PrecipitationSumMm AS CurrentPrecipitationSumMm,
    nih.[Base64] AS CurrentNdviImageBase64,
    ncih.[Base64] AS CurrentNaturalColorImageBase64,
    MeanNdvi AS CurrentMeanNdvi,
    IsOpenSeason
FROM Fields
LEFT JOIN WeatherHistory ON FieldId = Id AND [Date] = CAST(GETUTCDATE() AS DATE)
OUTER APPLY
(
    SELECT TOP 1 [Base64]
    FROM NdviImageHistory
    WHERE FieldId = Id
    ORDER BY [Date] DESC
) nih
OUTER APPLY
(
    SELECT TOP 1 [Base64]
    FROM NaturalColorImageHistory
    WHERE FieldId = Id
    ORDER BY [Date] DESC
) ncih
OUTER APPLY
(
    SELECT TOP 1 MeanNdvi
    FROM NdviHistory
    WHERE FieldId = Id
    ORDER BY [Date] DESC
) nh
OUTER APPLY
(
    SELECT TOP 1 CASE WHEN EndDate IS NULL THEN 1 ELSE 0 END AS IsOpenSeason
    FROM FieldSeasonHistory
    WHERE FieldId = Id
    ORDER BY [StartDate] DESC
) fsh
WHERE UserId = (SELECT Id FROM Users WHERE Email = @email) AND IsArchived = 0

I haven't created any indexes because I suppose that autogenerated clusted indexes (based on primary keys) should be enough (but I can be wrong). This query is executed about 15 seconds, but I want to reduce query time.


Edit: Added indexes to UserId and IsArchived columns of Fields table. Query execution plan:

enter image description here


** Edit 2:** Statistics:

SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

(13 row(s) affected)
Table 'FieldSeasonHistory'. Scan count 13, logical reads 26, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'NdviHistory'. Scan count 13, logical reads 26, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'NaturalColorImageHistory'. Scan count 13, logical reads 26, physical reads 0, read-ahead reads 0, lob logical reads 604, lob physical reads 0, lob read-ahead reads 0.
Table 'NdviImageHistory'. Scan count 13, logical reads 39, physical reads 0, read-ahead reads 0, lob logical reads 68, lob physical reads 0, lob read-ahead reads 0.
Table 'WeatherHistory'. Scan count 0, logical reads 39, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Users'. Scan count 0, logical reads 228, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Fields'. Scan count 1, logical reads 19, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

 SQL Server Execution Times:
   CPU time = 15 ms,  elapsed time = 16 ms.
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

Solution

  • You need multicolumn indexes for each of the tables. The index should be the column in the where, the column in the order by and then the column in the select. For instance:

    • NdviImageHistory(FieldId, [Date], [Base64])
    • NaturalColorImageHistory(FieldId, [Date], [Base64])
    • and so on.