
Does a DBMS actually bypass the OS and run over the filesystem to manage disk space?


I am working through the slides from the UCB CS186 Fall 2020 course, which say:

In terms of disk space management, there are 2 proposals:

  • Talk to the storage device directly, or
  • Run our own over filesystem (FS). Bypass the OS, allocate a single large “contiguous” file on an empty disk

I don't get the 2nd proposal. I do understand that leveraging the filesystem is great because it does a lot for us, but:

  • Why bypass the OS? Why not use the filesystem APIs offered by the OS to manage disk space?
  • What does "use FS but bypass OS" really mean? I thought most systems use the FS through the OS. Is that not the case in the world of DBMSs?

Solution

  • A DBMS aims to solve this problem: disk space is large but slow, memory is small but fast, so how do we make the database both large and fast? To do that, it has to handle both memory management and disk management.

    Typically, a DBMS relies on the OS filesystem for disk management but bypasses the OS (i.e. it does not rely on mmap) for memory (aka buffer pool) management.

    • Disk management: Very few DBMSs (BlueStore, ScyllaDB) bypass the OS filesystem and talk to the raw device directly. Because of the complexity, the loss of portability, and the modest speedup (~10% according to Andy Pavlo), this approach is not common.

    • Memory management: Most DBMSs have a logical understanding of their workload and transactions, while the OS is unaware of the relationship between different buffers in memory. This makes it beneficial for the DBMS to manage memory on its own.
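
To make the split above concrete, here is a minimal Python sketch, not taken from the course slides or from any real DBMS. It assumes a fixed 4096-byte page size and uses hypothetical names (`DiskManager`, `BufferPool`, `demo`). Disk space comes from one large file created through ordinary filesystem calls, while the decision about which pages stay in memory is made by the database code itself (simple LRU here, without page pinning or concurrency control).

```python
import os
import tempfile
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size, chosen only for illustration


class DiskManager:
    """Hypothetical disk manager: fixed-size pages stored in one large file."""

    def __init__(self, path, num_pages):
        # Create the file through the normal filesystem API and size it up front.
        self.file = open(path, "w+b")
        self.file.truncate(num_pages * PAGE_SIZE)

    def read_page(self, page_id):
        self.file.seek(page_id * PAGE_SIZE)
        return bytearray(self.file.read(PAGE_SIZE))

    def write_page(self, page_id, data):
        assert len(data) == PAGE_SIZE
        self.file.seek(page_id * PAGE_SIZE)
        self.file.write(data)

    def close(self):
        self.file.close()


class BufferPool:
    """Hypothetical buffer pool: the DBMS, not the OS, picks the eviction victim."""

    def __init__(self, disk, capacity):
        self.disk = disk
        self.capacity = capacity
        self.frames = OrderedDict()  # page_id -> bytearray, least recently used first
        self.dirty = set()
        self.disk_reads = 0

    def get_page(self, page_id):
        if page_id in self.frames:
            self.frames.move_to_end(page_id)  # mark as most recently used
            return self.frames[page_id]
        if len(self.frames) >= self.capacity:
            self._evict()
        self.disk_reads += 1
        page = self.disk.read_page(page_id)
        self.frames[page_id] = page
        return page

    def mark_dirty(self, page_id):
        self.dirty.add(page_id)

    def _evict(self):
        # Drop the least recently used page, writing it back first if modified.
        victim, data = self.frames.popitem(last=False)
        if victim in self.dirty:
            self.disk.write_page(victim, bytes(data))
            self.dirty.discard(victim)


def demo():
    path = os.path.join(tempfile.mkdtemp(), "db.pages")
    disk = DiskManager(path, num_pages=8)
    pool = BufferPool(disk, capacity=2)

    page = pool.get_page(0)
    page[:5] = b"hello"
    pool.mark_dirty(0)

    pool.get_page(1)
    pool.get_page(2)  # pool is full, so page 0 is evicted and written back

    result = bytes(pool.get_page(0)[:5])  # page 0 is read from the file again
    reads = pool.disk_reads
    disk.close()
    return result, reads


if __name__ == "__main__":
    print(demo())
```

A real system would layer pinning, latching, and write-ahead logging on top of this, and would often use a smarter replacement policy than plain LRU. The point of the sketch is only that the file comes from the OS while the caching policy lives in the database.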

    Credits to Aashray#4143 and miller#0114 in CMU 15-445 unofficial community (Discord).