I've been trawling books online and google incantations trying to find out what fill factor physically is in a leaf-page (SQL Server 2000 and 2005).
I understand that its the amount of room left free on a page when an index is created, but what I've not found is how that space is actually left: i.e., is it one big chunk towards the end of the page, or is it several gaps through that data.
For example, [just to keep the things simple], assume a page can only hold 100 rows. If the fill-factor is stated to be 75%, does this mean that the first (or last) 75% of the page is data and the rest is free, or is every fourth row free (i.e., the page looks like: data, data, data, free, data, data, data, free, ...).
The long and short of this is that I'm getting a handle on exactly what happens in terms of physical operations that occur when inserting a row into a table with a clustered index, and the insert isn't happening at the end of the row. If multiple gaps are left throught a page, then an insert has minimal impact (at least until a page split) as the number of rows that may need to be moved to accomodate the insert is minimised. If the gap is in one big chunk in the table, then the overhead to juggle the rows around would (in theory at least) be significantly more.
If someone knows an MSDN reference, point me to it please! I can't find one at the moment (still looking though). From what I've read it's implied that it's many gaps - but this doesn't seem to be explicitly stated.
From MSDN
:
The fill-factor setting applies only when the index is created, or rebuilt. The
SQL Server Database Engine
does not dynamically keep the specified percentage of empty space in the pages. Trying to maintain the extra space on the data pages would defeat the purpose of fill factor because theDatabase Engine
would have to perform page splits to maintain the percentage of free space specified by the fill factor on each page as data is entered.
and, further:
When a new row is added to a full index page, the
Database Engine
moves approximately half the rows to a new page to make room for the new row. This reorganization is known as a page split. A page split makes room for new records, but can take time to perform and is a resource intensive operation. Also, it can cause fragmentation that causes increasedI/O
operations. When frequent page splits occur, the index can be rebuilt by using a new or existing fill factor value to redistribute the data.
SQL Server
's data page consists of the following elements:
Page header
: 96
bytes, fixed.Data
: variableRow offset array
: variable.The row offset array is always stored at the end of the page and grows backwards.
Each element of the array is the 2
-byte value holding the offset to the beginning of each row within the page.
Rows are not ordered within the data page: instead, their order (in case of clustered storage) is determined by the row offset array. It's the row offsets that are sorted.
Say, if we insert a 100
-byte row with cluster key value of 10
into a clustered table and it goes into a free page, it gets inserted as following:
[00 - 95 ] Header
[96 - 195 ] Row 10
[196 - 8190 ] Free space
[8190 - 8191 ] Row offset array: [96]
Then we insert a new row into the same page, this time with the cluster key value of 9
:
[00 - 95 ] Header
[96 - 195 ] Row 10
[196 - 295 ] Row 9
[296 - 8188 ] Free space
[8188 - 8191 ] Row offset array: [196] [96]
The row is prepended logically but appended physically.
The offset array is reordered to reflect the logical order of the rows.
Given this, we can easily see that the rows are appended to the free space, starting from the beginning on the page, while pointers to the rows are prepended to the free space starting from the end of the page.