Search code examples
sqlquery-optimizationnon-clustered-index

SQL query optimization: get extra column data from non-clustered index


I am trying to write a query on a table that receives millions of records per day. I can narrow my query down to a time slice (logdate), but I need additional column data from it (num). Here is a sample query I'm using to test it:

DECLARE @StartTimeStamp DATETIME = '12/6/2019 7:56:50.799'
DECLARE @EndTimeStamp DATETIME = '12/6/2019 7:56:50.8'

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT tx.num, tx.logdate 
FROM hsi.transactionxlog tx
WHERE tx.logdate BETWEEN @StartTimeStamp AND @EndTimeStamp

This particular test, with a time span of .001 seconds, takes over four minutes to run. If I change it to a timeframe with no records found in the specified timeframe, then it takes almost perhaps one second to run, even specifying a span of 24 hours.

This table only has non-clustered indexes. One such index has the following columns in it: (num, logdate, and action, in that order).

How can I find the num corresponding to each record between @StartTimeStamp and @EndTimeStamp quickly? I would strongly prefer not to create additional indexes on this table, since many other application use it so often.


Solution

  • I found a temp table solution. Here is the essence of the solution:

    DECLARE @StartTimeStamp DATETIME = '12/6/2019 7:56:50.799'
    DECLARE @EndTimeStamp DATETIME = '12/6/2019 7:56:50.8'
    DECLARE @TempTable TABLE (logdate DATETIME, action BIGINT)
    
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
    INSERT INTO @TempTable
        SELECT logdate, action FROM hsi.transactionxlog
        WHERE logdate BETWEEN @StartTimeStamp AND @EndTimeStamp
    
    SELECT tx.num, tx.logdate 
    FROM hsi.transactionxlog tx
    INNER JOIN @TempTable t ON t.logdate = tx.logdate AND t.action = tx.action
    WHERE tx.logdate BETWEEN @StartTimeStamp AND @EndTimeStamp
    

    I don't really have a good explanation for why this works, but it is much faster, and time scales properly with the time difference between @StartTimeStamp and @EndTimeStamp. It simply selects the few thousand records, then for some reason it's easier for SQL to find them in the large table.

    Thank you for looking at the question and trying to answer.