Search code examples
sqlsql-serverperformancequery-optimization

Slow query performance due to partitions


In SQL Server, query B performs hundreds of times faster than query A. There are only 12 partitions in total. How to make query B as simple as query A but keep the performance?

Relevant SQL statements:

CREATE PARTITION FUNCTION PartitionFunction (bigint)
AS RANGE RIGHT
FOR VALUES ( 
2401010000000000000,
2402010000000000000,
2403010000000000000,
2404010000000000000,
2405010000000000000,
2406010000000000000,
2407010000000000000,
2408010000000000000,
2409010000000000000,
2410010000000000000,
2411010000000000000,
2412010000000000000
)

CREATE PARTITION SCHEME PartitionScheme
AS PARTITION PartitionFunction 
ALL TO ([PRIMARY])

CREATE TABLE [dbo].[EmailIdx]
(
    [Email] [varchar](255) NOT NULL,
    [DateAndTime] [bigint] NOT NULL,
    [Serial] [bigint] NOT NULL,

    CONSTRAINT [PK_Email] 
        PRIMARY KEY CLUSTERED ([Email] ASC, [DateAndTime] ASC, [Serial] ASC)
                WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
                      IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, 
                      ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) on PartitionScheme([DateAndTime])
)

Query A:

SELECT TOP 20
    Email,
    DateAndTime,
    Serial
FROM 
    EmailIdx WITH (NOLOCK)
WHERE
    Email = 'common_email@example.com'
    AND DateAndTime < 2412230000000000000
    AND DateAndTime > 2402110000000000000 
ORDER BY 
    DateAndTime DESC

Query B:

-- Union all partitions from February to December starting from December
WITH CombinedResults AS 
(
    SELECT TOP 20
        Email,
        DateAndTime,
        Serial
    FROM 
        EmailIdx WITH (NOLOCK)
    WHERE
        $PARTITION.PartitionFunction(DateAndTime) = 12
        AND Email = 'common_email@example.com'
        AND DateAndTime < 2412230000000000000
        AND DateAndTime > 2402110000000000000 
    ORDER BY 
        DateAndTime DESC

    UNION ALL

    SELECT TOP 20
        Email,
        DateAndTime,
        Serial
    FROM 
        EmailIdx WITH (NOLOCK)
    WHERE
        $PARTITION.PartitionFunction(DateAndTime) = 11
        AND Email = 'common_email@example.com'
        AND DateAndTime < 2412230000000000000
        AND DateAndTime > 2402110000000000000 
    ORDER BY 
        DateAndTime DESC

    UNION ALL    

    SELECT TOP 20
        Email,
        DateAndTime,
        Serial
    FROM 
        EmailIdx WITH (NOLOCK)
    WHERE
        $PARTITION.PartitionFunction(DateAndTime) = 10
        AND Email = 'common_email@example.com'
        AND DateAndTime < 2412230000000000000
        AND DateAndTime > 2402110000000000000 
    ORDER BY 
        DateAndTime DESC

    UNION ALL    

    SELECT TOP 20
        Email,
        DateAndTime,
        Serial
    FROM 
        EmailIdx WITH (NOLOCK)
    WHERE
        $PARTITION.PartitionFunction(DateAndTime) = 9
        AND Email = 'common_email@example.com'
        AND DateAndTime < 2412230000000000000
        AND DateAndTime > 2402110000000000000 
    ORDER BY 
        DateAndTime DESC

    UNION ALL    

    SELECT TOP 20
        Email,
        DateAndTime,
        Serial
    FROM 
        EmailIdx WITH (NOLOCK)
    WHERE
        $PARTITION.PartitionFunction(DateAndTime) = 8
        AND Email = 'common_email@example.com'
        AND DateAndTime < 2412230000000000000
        AND DateAndTime > 2402110000000000000 
    ORDER BY 
        DateAndTime DESC

    UNION ALL    

    SELECT TOP 20
        Email,
        DateAndTime,
        Serial
    FROM 
        EmailIdx WITH (NOLOCK)
    WHERE
        $PARTITION.PartitionFunction(DateAndTime) = 7
        AND Email = 'common_email@example.com'
        AND DateAndTime < 2412230000000000000
        AND DateAndTime > 2402110000000000000 
    ORDER BY 
        DateAndTime DESC

    UNION ALL    

    SELECT TOP 20
        Email,
        DateAndTime,
        Serial
    FROM 
        EmailIdx WITH (NOLOCK)
    WHERE
        $PARTITION.PartitionFunction(DateAndTime) = 6
        AND Email = 'common_email@example.com'
        AND DateAndTime < 2412230000000000000
        AND DateAndTime > 2402110000000000000 
    ORDER BY 
        DateAndTime DESC

    UNION ALL    

    SELECT TOP 20
        Email,
        DateAndTime,
        Serial
    FROM 
        EmailIdx WITH (NOLOCK)
    WHERE
        $PARTITION.PartitionFunction(DateAndTime) = 5
        AND Email = 'common_email@example.com'
        AND DateAndTime < 2412230000000000000
        AND DateAndTime > 2402110000000000000 
    ORDER BY 
        DateAndTime DESC

    UNION ALL    

    SELECT TOP 20
        Email,
        DateAndTime,
        Serial
    FROM 
        EmailIdx WITH (NOLOCK)
    WHERE
        $PARTITION.PartitionFunction(DateAndTime) = 4
        AND Email = 'common_email@example.com'
        AND DateAndTime < 2412230000000000000
        AND DateAndTime > 2402110000000000000 
    ORDER BY 
        DateAndTime DESC

    UNION ALL    

    SELECT TOP 20
        Email,
        DateAndTime,
        Serial
    FROM 
        EmailIdx WITH (NOLOCK)
    WHERE
        $PARTITION.PartitionFunction(DateAndTime) = 3
        AND Email = 'common_email@example.com'
        AND DateAndTime < 2412230000000000000
        AND DateAndTime > 2402110000000000000 
    ORDER BY 
        DateAndTime DESC

    UNION ALL    

    SELECT TOP 20
        Email,
        DateAndTime,
        Serial
    FROM 
        EmailIdx WITH (NOLOCK)
    WHERE
        $PARTITION.PartitionFunction(DateAndTime) = 2
        AND Email = 'common_email@example.com'
        AND DateAndTime < 2412230000000000000
        AND DateAndTime > 2402110000000000000 
    ORDER BY 
        DateAndTime DESC
)
SELECT TOP 20
    Email,
    DateAndTime,
    Serial
FROM 
    CombinedResults

Both queries return the correct results but the first one runs a very inefficient query plan that seems to miss the point that all dates in December partitions will be > November partition, etc, etc.


Solution

  • I have no idea why you aren't just using datetime2 here and are rolling your own scheme encoding datetimes as integers but anyway you should use

    WHERE
        Email = 'common_email@example.com'
        AND DateAndTime < CAST(2412230000000000000 AS BIGINT)
        AND DateAndTime > CAST(2402110000000000000 AS BIGINT)
    

    The literals 2412230000000000000 etc are decimal(19,0) not bigint as defined in the Partition Function and the data type mismatch is causing the extra operators to be added and more probability of a less efficient plan.

    With that in place I get the plan that only visits the 11 expected partitions numbered from 3 to 13 (as required by BETWEEN $PARTITION.PartitionFunction(2402110000000000000) AND $PARTITION.PartitionFunction(2412230000000000000)) and no sort operator

    enter image description here

    (As opposed to this plan with the mismatched datatypes)

    enter image description here