I want to understand better index organization. Imagine we have a table with 2 columns:
CREATE TABLE user(
name varchar(100)
,age int)
We would like to create an index:
CREATE INDEX IDX_MultiColIdx on user(name,age)
How would B-Tree index organization look like?
In case of one column, say, age, the organization is clear: every non-leaf node would contain a set of integer keys which would be used for a search. Which values contain nodes of our IDX_MultiColIdx B-Tree index?
Which values contains nodes of our IDX_MultiColIdx B-Tree index?
Values of name
, age
and the row pointer (RID
/ROWID
or clustered key, depending on the table organization), sorted lexicographically.
How exactly they will be stored, depends on the datatype and database system.
Usually, CHAR
is stored right-padded with spaces up to its size, while VARCHAR
is prepended with its length.
MyISAM
and some other engines can use key compression: the matching parts of a set of keys are only stored once, and the other keys only store the differing parts, like this:
Hamblin
Hamblin, California
Hamblin (surname)
Hambling Baronets
Hambly
Hambly Arena
Hambly Arena Fire
Hambo
Hambo Lama Itigelov
Hambok
Hambone
will be stored as:
Hamblin
[7], California
[7] (surname)
[7]g Baronets
Hambly
[6] Arena
[6] Arena Fire
Hambo
[5] Lama Itigelov
[5]k
[5]ne
, where [x]
means "take leading x
characters from the previous key"