Search code examples
sqlsql-serverpaginationkeyset-pagination

Is there any better option to apply pagination without applying OFFSET in SQL Server?


I want to apply pagination on a table with huge data. All I want to know a better option than using OFFSET in SQL Server.

Here is my simple query:

SELECT *
FROM TableName
ORDER BY Id DESC 
OFFSET 30000000 ROWS
FETCH NEXT 20 ROWS ONLY

Solution

  • You can use Keyset Pagination for this. It's far more efficient than using Rowset Pagination (paging by row number).

    In Rowset Pagination, all previous rows must be read, before being able to read the next page. Whereas in Keyset Pagination, the server can jump immediately to the correct place in the index, so no extra rows are read that do not need to be.

    For this to perform well, you need to have a unique index on that key, which includes any other columns you need to query.

    In this type of pagination, you cannot jump to a specific page number. You jump to a specific key and read from there. So you need to save the unique ID of page you are on and skip to the next. Alternatively, you could calculate or estimate a starting point for each page up-front.

    One big benefit, apart from the obvious efficiency gain, is avoiding the "missing row" problem when paginating, caused by rows being removed from previously read pages. This does not happen when paginating by key, because the key does not change.


    Here is an example:

    Let us assume you have a table called TableName with an index on Id, and you want to start at the latest Id value and work backwards.

    You begin with:

    SELECT TOP (@numRows)
      *
    FROM TableName
    ORDER BY Id DESC;
    

    Note the use of ORDER BY to ensure the order is correct

    In some RDBMSs you need LIMIT instead of TOP

    The client will hold the last received Id value (the lowest in this case). On the next request, you jump to that key and carry on:

    SELECT TOP (@numRows)
      *
    FROM TableName
    WHERE Id < @lastId
    ORDER BY Id DESC;
    

    Note the use of < not <=

    In case you were wondering, in a typical B-Tree+ index, the row with the indicated ID is not read, it's the row after it that's read.


    The key chosen must be unique, so if you are paging by a non-unique column then you must add a second column to both ORDER BY and WHERE. You would need an index on OtherColumn, Id for example, to support this type of query. Don't forget INCLUDE columns on the index.

    SQL Server does not support row/tuple comparators, so you cannot do (OtherColumn, Id) < (@lastOther, @lastId) (this is however supported in PostgreSQL, MySQL, MariaDB and SQLite).

    Instead you need the following:

    SELECT TOP (@numRows)
      *
    FROM TableName
    WHERE (
        (OtherColumn = @lastOther AND Id < @lastId)
        OR OtherColumn < @lastOther
    )
    ORDER BY
      OtherColumn DESC,
      Id DESC;
    

    This is more efficient than it looks, as SQL Server can convert this into a proper < over both values.

    The presence of NULLs complicates things further. You may want to query those rows separately.