innodb_buffer_pool_size - This parameter defines the memory allocated to cache frequently used data. I have many tables in my schema which have millions of records in them. When I checked the buffer cache usage, it had occupied all of the allocated memory.
Is there any way to identify which tables or indexes are being cached, and which of them have a higher cache hit rate than the others? Once I know that, I may delete some data from those tables.
If a table contains 1000 records and a query is executed with a condition on the primary key, does it cache the whole index in the buffer cache?
If none of the above is a solution, I can reduce the size of the MySQL buffer pool. Is there any other way to decrease the memory usage of the MySQL process? I want to allocate that memory to another process.
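For reference, this is roughly how the usage can be checked and how I understand the size can be changed. The 2G value below is only an example, and online resizing needs MySQL 5.7 or later:

```sql
-- How full is the buffer pool?  Compare pages_data / pages_free to pages_total.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages%';

-- Current buffer pool size, in MB
SELECT @@innodb_buffer_pool_size / 1024 / 1024 AS buffer_pool_mb;

-- Shrink it online (MySQL 5.7+; older versions need a config change and restart).
-- 2GB here is an illustrative value only.
SET GLOBAL innodb_buffer_pool_size = 2 * 1024 * 1024 * 1024;
```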
No.
The "cache" works on a block-by-block base, not whole table, nor whole indexes.
- The data of a table is organized in a BTree of 16KB blocks.
- Each secondary INDEX of a table is organized in a BTree of 16KB blocks.
- Each block is loaded into the buffer_pool and bumped from the buffer_pool independently of other blocks.
- The cache is (roughly) "least-recently-used". (Again, I emphasize that "block" is the unit, not table or index.)
- The root node of a BTree is more heavily used, so it tends to stay in the buffer_pool. You might find that the root block of every table touched since restarting is still in the buffer_pool. Yet, lots of other blocks may have come and gone. (The sketch after this list shows how to count the 16KB blocks behind each table and index.)
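To make the "block" unit concrete, here is a rough way to see how many 16KB blocks each table and its indexes occupy. This is only a sketch: it relies on the statistics in information_schema.TABLES, which are estimates, and 'your_schema' is a placeholder:

```sql
-- InnoDB page ("block") size; 16KB by default
SHOW VARIABLES LIKE 'innodb_page_size';

-- Approximate pages per table (data BTree) and its secondary indexes.
-- data_length / index_length are estimates, not exact counts.
SELECT  table_name,
        ROUND(data_length  / @@innodb_page_size) AS approx_data_pages,
        ROUND(index_length / @@innodb_page_size) AS approx_index_pages
FROM    information_schema.TABLES
WHERE   table_schema = 'your_schema'              -- placeholder schema name
  AND   engine = 'InnoDB'
ORDER BY data_length + index_length DESC;
```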
To your questions:
- Is there any way to identify which tables or indexes are being cached, and which have a higher cache hit rate than others? -- Not easily (a rough approach is sketched after this list). Even if you had the info, it would not be useful (see above).
- Once I know that, I may delete some data from those tables. -- No. Even if you could, it would not be of much use.
- If a table contains 1000 records and a query is executed with a condition on the primary key... -- Since the data is sorted by the PK, those 1000 rows would be "clustered" together in consecutive blocks; I would estimate only about 10 blocks.
- Does it cache the whole index in the buffer cache? -- As said above, only block by block.
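That said, if you just want a rough snapshot of what currently sits in the buffer pool, MySQL 5.6+ exposes it via information_schema.INNODB_BUFFER_PAGE. Treat this as a sketch, not something to run routinely: the query scans the whole buffer pool and can be expensive on a busy server, which fits with the "not easily" above.

```sql
-- Snapshot of which tables/indexes currently own pages in the buffer pool.
-- WARNING: this scans the whole buffer pool; it can be slow on a busy
-- production server.  Requires the PROCESS privilege.
SELECT  TABLE_NAME, INDEX_NAME,
        COUNT(*)                               AS pages_in_pool,
        ROUND(SUM(DATA_SIZE) / 1024 / 1024, 1) AS data_mb
FROM    information_schema.INNODB_BUFFER_PAGE
GROUP BY TABLE_NAME, INDEX_NAME
ORDER BY pages_in_pool DESC
LIMIT 20;
```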
More to note:
- A simple Rule of Thumb is that a BTree block contains 100 items.
- This implies that a million-row BTree (table or index) will be about 3 levels deep (100^3 = 1M), and there will be about 10,000 (1M/100) leaf blocks. (The sketch after this list shows how to check the real counts for your own indexes.)
- (Continuing with a 1M-row table) Fetching 1 row via the PK will bring about 3 blocks into the cache.
- If a block is not in cache, it needs to bump out an 'old' (LRU) block to make room, then do a disk read (slow). If the block is already in cache, no I/O needed.
- Fetching 1 row via a secondary key will need about 6 blocks. That's 3 for drilling down the secondary index's BTree, then 3 more to find the record in the data's BTree. (The PK was implicitly in the "row" of the secondary index.)
- To fetch 1000 consecutive rows (eg, via a 'range' query) via the primary key: 3 + 10 blocks need to be in the cache, so 0 to 13 disk reads.
- To fetch 1000 consecutive rows (eg, via a 'range' query) via a secondary index: 3 + 10 blocks in the index, then 3 in the data, plus between 10 and 1000 more.
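If you want to check these rules of thumb against your own data, the persistent index statistics record actual page counts per index. A sketch, assuming persistent statistics are enabled (innodb_stats_persistent=ON, the default since 5.6) and with 'your_table' as a placeholder:

```sql
-- 'size' = total pages in the index BTree, 'n_leaf_pages' = leaf pages only.
-- For a ~1M-row table, expect n_leaf_pages in the ballpark of 10,000
-- if the 100-items-per-block rule of thumb holds for your row size.
SELECT  database_name, table_name, index_name, stat_name, stat_value
FROM    mysql.innodb_index_stats
WHERE   table_name = 'your_table'            -- placeholder table name
  AND   stat_name IN ('size', 'n_leaf_pages');
```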
OK, I have rambled on too long. What would you like to know next?