Currently I am reading about B+ Tree
basics, and got confused regarding space allocation for clustered and unclustered index.
When we create clustered index on B+ tree
, the index get stored in the main memory and the leaves contain the data pointers to actual blocks. The blocks are stored in disks, and the blocks contain record.
Now suppose we have a table (id, name, class) and I created two unclustered indexes on name
and class
. My doubt is where will the unclustered index will get stored? and how searching will be performed for a query
like
select id, name, class from table where id = 3, name='Leo' and class='10'
My assumption:
name
and class
, we will find the remaining fieldsDo you think my assumption is right? Could you elaborate more regarding storing the clustered index? Does the both the index (clustered and non clustered form a n-ary tree? ). I am not able to visualize both the clustered and unclustered index together.
I am speaking specifically about InnoDB...
The PRIMARY KEY
is (as you say) clustered with the data. The entire BTree for that (data + PK) is stored in one set of blocks on disk (not 'main memory'). The 'leaf' nodes contain all the columns.
A secondary key is a separate BTree. Structurally the two BTrees are the same with the exception of what is in the leaf nodes. For a secondary key, a copy of the PRIMARY KEY
is put into to the leaf nodes. Hence, when looking up one row ("point query") using a secondary index, there are two BTree drill-downs - one for the secondary index, and a secondary for the PK.
All blocks are 'cached' in the 'buffer_pool', so they are sometimes in main memory, but are always persisted (sooner or later) on disk. (The transaction log, etc) assure that "later" does not violate the rule that the data always persists.)
Your two pictures are a nice start. However...
What you need to know at a higher level than the pictures are trying to convey:
WHERE clustered_key BETWEEN ...
is very efficientWHERE secondary_key BETWEEN ...
is very efficient at finding the PK values it needs, but then turns into a bunch of (potentially) random point queries.Now for MyISAM:
(There are a lot more details.)