Search code examples
sql-serverterminology

What is the difference between Lookup, Scan and Seek?


So I found this query

SELECT MAX(us.[last_user_lookup]) as [last_user_lookup], MAX(us.[last_user_scan]) 
AS [last_user_scan], MAX(us.[last_user_seek]) as [last_user_seek] 
from sys.dm_db_index_usage_stats as us 
where us.[database_id] = DB_ID() AND us.[object_id] = OBJECT_ID('tblName')
group by us.[database_id], us.[object_id];

when i look up the documentation on sys.dm_db_index_usage_stats all it says is

last_user_seek      datetime    Time of last user seek
last_user_scan      datetime    Time of last user scan.
last_user_lookup    datetime    Time of last user lookup.

...

Every individual seek, scan, lookup, or update on the specified index by one query execution is counted as a use of that index and increments the corresponding counter in this view. Information is reported both for operations caused by user-submitted queries, and for operations caused by internally generated queries, such as scans for gathering statistics.

Now I understand that when I run the query it's getting the highest time of those 3 fields as sys.dm_db_index_usage_stats can have duplicate database_id and object_id where one or more of the fields may also be NULL (so you can just to a SELECT TOP 1 ... ORDER BY last_user_seek, last_user_scan, last_user_lookup DESC otherwise you potentially miss data) but when I run it I get values like

NULL | 2017-05-15 08:56:29.260 | 2017-05-15 08:54:02.510

but I don't understand what the user has done with the table which is represented by these values.

So what is the difference between Lookup, Scan and Seek?


Solution

  • Basic difference between these operations:

    Let's consider that you have two tables. TableA and TableB. Both tables contain more than 1000 000 rows, and both have clustered indexes on Id column. TableB has also nonclustered index on code column. (Remember that your nonclustered index is always pointing at pages of clustered one...)

    seek:

    Let's consider that you want only 1 record from TableA and your clustered index is on column Id. Query should be like:

    SELECT Name
    FROM TableA
    WHERE Id = 1
    

    Your result contains fewer than 15% (it is something between 10-20, depends on situation) of your full data set... Sql Server performs index seek in this scenario. (optimizer has found a useful index to retrieve data)

    scan:

    For example your query needs more than 15% of data from TableA, then it is necessary to scan the whole index to satisfy the query. Let's consider that TableB has TableA Id column as foreign key from TableA, and TableB contains all Ids from TableA. Query should be like:

    SELECT a.Id
    FROM TableA a
    JOIN TableB b ON a.Id = b.TableAId
    

    Or just

    SELECT *
    FROM TableA
    

    For index on TableA SQL Server performs use index scan. Because all data (pages) need to satisfy the query...

    lookup:

    Let's consider that TableB has column dim and also column code and nonclustered index on code (as we mentioned). SQL Server will use lookup when it needs to retrieve non key data from the data page and nonclustered index is used to resolve the query. For example key lookup could be used in query like:

    SELECT id, dim
    FROM TableB
    WHERE code = 'codeX'
    
    • You can resolve it by covering index (include dim to nonclustered one)