Search code examples
sql-serverpartitioningclustered-indexcolumnstorefilegroup

Single Column Huge table (2.5 B rows). Clustered index Vs Clustered Columnstore index


We are having a huge table Table1(2.5 billion rows) with single column A(NVARCHAR(255) datatype). What is the right approach for seek operations against this table. Clustered index on A Vs Clustered Column store index on A.

We are already keeping this table in separate filegroup from the other table Table2, with which it will be Joined.

Do you suggest partitioning this table for better performance ? This column will have unicode data also. So, what kind of partitioning approach is fine for unicode datatype ?

UPDATE: To clarify further, the use case for the table is SEEK. The table is storing identifiers for individuals. The major concerns here are performance for SEEK in the case of huge table. This table will be referred inside a transaction. We want the transaction to be short.


Solution

  • Clustered index vs column store index depends on the use case for the table. Column store keeps track of unique entries in the column and the rows where those entries are stored. This makes it very useful for data warehousing tasks such as aggregates against the indexed columns, however not as optimal for transactional tasks that need to pull a small number of specific rows. If you are using SQL Server 2014 or later you can use both a clustered index and a columnstore index by creating a clustered columnstore index. It does have some limitations and overhead that you should read up on though.

    Given that this is a seek for specific rows and not an aggregation of the column, I would recommend a clustered index instead of a column store index.