Search code examples
sqlsql-serverperformancesql-order-bysqlperformance

Order by extreme performance impact


I have a complex query in SQL Server which take 70 seconds to complete with 3,000,000 records.
When I remove the two ORDER BY clauses, it just takes 0.5 seconds to complete.

SELECT TOP 100
    ROW_NUMBER() OVER ( ORDER BY format([DetectionDateTime],'yyyy/MM/dd') DESC , [DetectionTime] DESC) AS RowId,
    *
FROM (
    SELECT
        dbo.tblValidDetections.Id,
        dbo.tblCameras.CameraName, 
        TRY_CAST((TRY_CAST(dbo.tblValidDetections.DetectionDate as varchar)
            +  ' '
            + TRY_CAST(dbo.tblValidDetections.DetectionTime As varchar)) as datetime)
            as DetectionDateTime, 
        format(dbo.tblValidDetections.DetectionDate, 'yyyy/MM/dd', 'fa') AS DetectionDatePersian,
        dbo.tblValidDetections.DetectionTime,
        TRY_CAST(dbo.tblValidDetections.DetectionTime as varchar) AS DetectionTimeAsString,
        dbo.tblValidDetections.PelakCitySection,
        dbo.tblValidDetections.PelakRightSection,
        dbo.tblValidDetections.PelakAlphabetSection, 
        dbo.tblValidDetections.PelakLeftSection, 
        dbo.tblValidDetections.PelakCitySection + N' - ' + 
            dbo.tblValidDetections.PelakRightSection + N' ' + 
            dbo.tblValidDetections.PelakAlphabetSection + N' ' + 
            dbo.tblValidDetections.PelakLeftSection
            AS Pelak, 
        CASE
            WHEN dbo.tblValidDetections.Direction = 1 THEN N'In'
            WHEN dbo.tblValidDetections.Direction = 0 THEN N'Out'
            ELSE N'N/A'
            END AS CarDirection,
        dbo.__tbl_base_PelakTypesFarsi.pelakType, 
        dbo.__tbl_base_ProvinceList.ProvinceName, 
        dbo.__tbl_base_CityList.CityName
    FROM dbo.tblValidDetections 
    INNER JOIN dbo.tblCameras 
        ON dbo.tblValidDetections.DetectionCameraId = dbo.tblCameras.Id 
    INNER JOIN dbo.__tbl_base_PelakTypesFarsi 
        ON dbo.tblValidDetections.PelakTypeId = dbo.__tbl_base_PelakTypesFarsi.Id 
    LEFT OUTER JOIN dbo.__tbl_base_CityList 
        ON dbo.tblValidDetections.CityId = dbo.__tbl_base_CityList.Id 
    LEFT OUTER JOIN dbo.__tbl_base_ProvinceList 
        ON dbo.tblValidDetections.ProvinceId = dbo.__tbl_base_ProvinceList.Id
        AND dbo.__tbl_base_CityList.ProvinceId = dbo.__tbl_base_ProvinceList.Id
) As DetectionView  
ORDER BY
    format([DetectionDateTime],'yyyy/MM/dd') DESC,
    [DetectionTime] DESC

I've converted DetectionDate to the fa culture to get the Persian Calendar date, which I don't think has to do with performance here.

As @T-N great suggestion in a comment on his answer to use a single DateTime2 column , I redesigned my table as below

SELECT TOP 1000
    ROW_NUMBER() OVER (ORDER BY CONVERT(date, VD.DetectionDateTime) DESC , CONVERT(time, VD.DetectionDateTime) DESC) AS RowId,
    VD.Id,
    C.CameraName, 
    VD.DetectionDateTime, 
    CONVERT(date, VD.DetectionDateTime) as DetectionDate,
    CONVERT(time, VD.DetectionDateTime) as DetectionTime,
    format(VD.DetectionDateTime, 'yyyy/MM/dd', 'fa') AS DetectionDatePersian,
    CONVERT(varchar, CONVERT(time, VD.DetectionDateTime)) As DetectionTimeAsString,
    VD.PelakCitySection,
    VD.PelakRightSection,
    VD.PelakAlphabetSection, 
    VD.PelakLeftSection, 
    VD.PelakCitySection + N' - ' + 
        VD.PelakRightSection + N' ' + 
        VD.PelakAlphabetSection + N' ' + 
        VD.PelakLeftSection
        AS Pelak, 
    CASE
        WHEN VD.Direction = 1 THEN N'In'
        WHEN VD.Direction = 0 THEN N'Out'
        ELSE N'N/A'
        END AS CarDirection,
    PTF.pelakType, 
    PL.ProvinceName, 
    CL.CityName,
    VD.CarColor,
    VD.CarType,
    VD.CarSpeed
FROM dbo.tblValidDetections VD
INNER JOIN dbo.tblCameras C
    ON VD.DetectionCameraId = C.Id 
INNER JOIN dbo.__tbl_base_PelakTypesFarsi PTF
    ON VD.PelakTypeId = PTF.Id 
LEFT OUTER JOIN dbo.__tbl_base_CityList CL
    ON VD.CityId = CL.Id 
LEFT OUTER JOIN dbo.__tbl_base_ProvinceList PL
    ON VD.ProvinceId = PL.Id
    AND CL.ProvinceId = PL.Id

WHERE  DetectionDateTime >='2024-04-12 00:00:00' AND DetectionDateTime <= '2024-04-12 23:59:59'
ORDER BY CONVERT(date, VD.DetectionDateTime) DESC , CONVERT(time, VD.DetectionDateTime) DESC

It's working and the performance is not bad ( about 2 seconds to get the 1K records via 3M total records)

But I think it can be more efficient because I've used CONVERTs repeatedly.


Solution

  • As others have stated, the main problem is that you have used calculated values in your ORDER BY clauses - both for the main query and in the ROW_NUMBER() calculation. That eliminates any possibility that SQL server can use a index to efficiently limit the query results to the most recent 100 rows.

    So, the first thing to do is to replace those ORDER BY expressions with direct column references in both places - ORDER BY DetectionDate DESC, DetectionTime DESC. You also need to ensure that you have an appropriate index on tblValidDetections(DetectionDate, DetectionTime), or some index that starts with those two columns.

    As far as I can tell, the subselect is no longer needed, so it can be eliminated.

    Lastly, to reduce clutter and improve readability, defining table aliases and using them to qualify all column references is good practice.

    The result would be something like:

    SELECT TOP 100
        ROW_NUMBER() OVER (ORDER BY VD.DetectionDate DESC, VD.DetectionTime DESC) AS RowId,
        VD.Id,
        C.CameraName, 
        TRY_CAST((TRY_CAST(VD.DetectionDate as varchar)
            +  ' '
            + TRY_CAST(VD.DetectionTime As varchar)) as datetime)
            as DetectionDateTime, 
        format(VD.DetectionDate, 'yyyy/MM/dd', 'fa') AS DetectionDatePersian,
        VD.DetectionTime,
        TRY_CAST(VD.DetectionTime as varchar) AS DetectionTimeAsString,
        VD.PelakCitySection,
        VD.PelakRightSection,
        VD.PelakAlphabetSection, 
        VD.PelakLeftSection, 
        VD.PelakCitySection + N' - ' + 
            VD.PelakRightSection + N' ' + 
            VD.PelakAlphabetSection + N' ' + 
            VD.PelakLeftSection
            AS Pelak, 
        CASE
            WHEN VD.Direction = 1 THEN N'In'
            WHEN VD.Direction = 0 THEN N'Out'
            ELSE N'N/A'
            END AS CarDirection,
        PTF.pelakType, 
        PL.ProvinceName, 
        CL.CityName
    FROM dbo.tblValidDetections VD
    INNER JOIN dbo.tblCameras C
        ON VD.DetectionCameraId = C.Id 
    INNER JOIN dbo.__tbl_base_PelakTypesFarsi PTF
        ON VD.PelakTypeId = PTF.Id 
    LEFT OUTER JOIN dbo.__tbl_base_CityList CL
        ON VD.CityId = CL.Id 
    LEFT OUTER JOIN dbo.__tbl_base_ProvinceList PL
        ON VD.ProvinceId = PL.Id
        AND CL.ProvinceId = PL.Id
    ORDER BY VD.DetectionDate DESC, VD.DetectionTime DESC
    

    If you original DetectionView subquery was actually a VIEW (that you embedded in the query just for this question), you will need to modify that view to expose the original DetectionDate and DetectionTime columns. This would then allow the outer query to reference them as needed. Your final select list would need to explicitly pick the desired columns instead of using *, if you with to exclude those columns from the final results.