Clustered versus nonclustered index in a SQL Server table

Working in a datawarehouse environment (SQL Server 2008) and there are a few tables that have about 2 million rows and 20 columns. Each night, the tables are dropped and re-created. When they are made, indexes are also built. For some reason there is no clustered index on these tables. There are however, unique nonclustered indexes. Seems unlogical. Does anybody know any downsides to changing that all so that all these tables have a clustered index. Should save some space and perform better.

Any tips?

Thanks in advance.

Solution

Indeed clustered indexes have drawbacks too.

I think the most under-estimated drawback is what I refer to as clustered index penalty.

If you don't have any clustered index on a table, it means the table is stored as a heap table. All non-clustered indexes refer to that heap table.

The good thing about heap tables is that rows stored in them hardly ever move — unlike clustered indexes, in which every row can move at any time to a different physical location.

This difference affects non-clustered indexes as they refer to the rows in the heap or clustered index: in case of a heap, they can just store the physical location of that row in the non-clustered indexes (as they hardly ever change). If there is a clustered index, the non-clustered indexes store the clustering key.

If you are eventually using a non-clustered index, the effort to reach the actual table in either the heap or clustered index is very different: With a heap, you just need one physical IO, with a clustered index, you need to do a Clustered Index Seek, which is typically 3-5 logical IOs (depends on the table size).

If you have many non-clustered indexes and are using them without index-only scan (that means there follows a RID access), a clustered index may hurt performance considerably.

More details about that in this article I wrote:

http://use-the-index-luke.com/blog/2014-01/unreasonable-defaults-primary-key-clustering-key