For example we have table with clustered index table1 (col1 int , col2 int , col3 int), clustered index defined for col1, non-clustered index defined for col2, we write query - select * from table1 where col2 = 'some value' - optimizer comes to non-clustered index seek, grabs specific row and goes to clustered index to grab rest of data that not contained inside of non-clustered index (performs key lookup to grab col3 in this case). Key lookup essentially is clustered index seek based on clustered index value that was found in non-clustered index on leaf level.
What happens when we don't have clustered index on table but we have non-clustered index? I know that it will perform RID lookup but how this logically works? Clustered index value will be found using b-tree search as we have all rows in particular order but how this will be found in heap table without any particular order? Per my understanding when we found row inside non-clustered index (this has non-clustered index key + rowid) we need to scan whole heap table to find this rowid there because we do not have any order there and can not navigate this row using b-tree , is it correct?
An example may help. Below creates a heap and inserts a couple of rows to it
CREATE TABLE Demo(X CHAR(1));
INSERT INTO Demo VALUES ('A'), ('B');
You can then use the following to see the RID
SELECT X,
%%physloc%% as rid,
sys.fn_PhysLocFormatter(%%physloc%%) as formatted
FROM Demo
Which for me returns
+---+--------------------+-----------+
| X | rid | formatted |
+---+--------------------+-----------+
| A | 0xE700000001000000 | (1:231:0) |
| B | 0xE700000001000100 | (1:231:1) |
+---+--------------------+-----------+
The RID is an 8 byte binary value which is a concatenation of three components. FileNumber:PageNumber:SlotNumber
.
Each file is divided up into 8KB pages with numbering starting at 0 so it is straightforward to calculate the offset in the file for a given page number. Page 231 is the 8KB section starting at byte offset 1892352 in the file (231 * 8192
).
To locate the row belonging to RID 1:231:1
it just needs to get that relevant page (1:231
) from the buffer manager (which will read it from disc if needed) and go to the second slot on the page (the slot numbering starts from 0).
Each data page has a slot array in the footer giving the offset of the row for every row on the page.