Search code examples
sqlindexingclustered-index

How are the table data stored when it has a clustered index


I have found umpteen posts which begin like Quite a lot of time I have come across people saying "Clustered Index Physically sorts the data inside the table based on the Clustered Index Keys". It's not true! Then such posts go on to describe how it is actually stored, via linked lists or whatever. For example, this post says that

Each Index row contains a Key value and a pointer to either an Intermediate level page in the B-tree, or a Data row in the Leaf level of the Index. The Pages in each level of the Index are linked in a Doubly-linked list. The Pages in the Data chain and the rows in them are ordered on the value of the Clustered Index key.

That brings me to my question, the data pages are the place where the table data are stored, right? So if they are sorted and the data within them also are sorted according to the indexed column value, why is it wrong to say that the clustered index keeps the table data in sorted order? Here is a pic from Kalen Delaney's book, which shows that the leaf pages in a table with CI are all sorted according to the CI value:

enter image description here


Solution

  • You're right.

    Clustered indexes do not physically sort the data inside the table based on the Clustered Index Keys. If that was the case then the inserts into the middle of a large table with no free space would require huge amounts of IO to make room for the new record.

    Instead a new page is allocated from anywhere in the file and linked into the linked list.

    The degree to which the physical order of pages differs from the logical order is the extent of logical fragmentation. Rebuilding or reorganizing the index can reduce this.