Search code examples
sqlsql-serverazure-sql-databasecolumnstore

Columnstore index - slow performance on offset-fetch query


We have Fact table around 35M rows on Azure database (premium tier), this table has cluster columnstore index enabled in order to boost query performance.

We did pagination (to index on Elastic Search) on Fact table using similar below code:

SELECT *
FROM [SPENDBY].[FactInvoiceDetail]
ORder by id
offset 1000000 rows fetch next 1000 rows only

But this query performs so slow, even over 10 minutes, it's not finished. If we change to use TOP, it works really well and take around 30 seconds:

SELECT TOP 1000 * 
FROM [SPENDBY].[FactInvoiceDetail]
WHERE ID > 1000000 
ORDER BY Id

The estimated execution plan for offset-fetch query:

enter image description here

I am not sure that I understand whether offset-fetch query performs very poorly on cluster columnstore index or not.

This table also have a lot of none-cluster B-tree indexes on foreign keys and one unique index on the Id of Fact table in order to boost performance

This execution plan for offset-fetch query:

https://pastebin.com/BM8MXQMg


Solution

  • There are a few issues here.

    1) Ordering BTree index is not a covering index for the paging query.
    
    2) The rows must be reconstructed from the CCI.
    
    3) The offset is large.
    

    Paging queries need a BTree index on the ordering columns to calculate which rows should be returned, and if that BTree index doesn't include all the requested columns, then a row lookup be required for each row. This is the "Nested Loops" operator in the query plan.

    But the rows are stored in a CCI which means that each column is in a separate data structure, and reading a single row requires one Logical IO for each column, for each row. That's why this query is especially expensive. And why a CCI is a poor choice for a paging query. A Clustered Index on the ordering columns, or a non-clustered index on the ordering columns with the remaining requested columns included would be much better.

    A secondary and smaller issue here is the large offset. SQL has to skip over the offset rows, counting them at it goes. So this will reading the first N pages of the BTree leaf level pages to skip over rows.