Given that I want to create my own database store, what size should the files be to avoid fragmentation and filesystem overhead, especially in the light of the "new" SSDs?
Would a lot of 64 kbyte files be ok, for instance? Or would that be using up file (inode) entries at an alarming rate?
Is it better to use a huge file and access it only within 64 kbyte boundaries?
(I am using 64 kbyte as an example. Maybe 4 kbyte is the magic size? Also tell me if I am rambling or if I got my point across.)
Good questions.
The flash in modern SSDs is usually(!) structured as follows: pages of 2K or 4K that can be written, and erase blocks of 256K. A page cannot be overwritten without being erased first, but the erase operation only works on full erase blocks. On top of that, each erase operation takes a long time (compared to other IO operations) and slowly wears out the SSD.
A component of the SSD controller called the FTL (Flash Translation Layer) provides the illusion of an HDD-like block device on top of the flash semantics. An SSD can be used like an HDD, but to get the most out of it (and to keep doing so for a long time), a software IO design that takes the properties of the storage into account works best.
However, the SSD controller logic is usually not known, so the details may differ from SSD to SSD. Still, here are a few rules of thumb:
If possible, I would align my IO pattern and the file sizes to full erase blocks (or multiples of them). Writing a file of 256K then uses a full erase block without any internal fragmentation. Smaller files like 64K would use only a portion of a block, and writing data to the rest of that block later might lead to a read-modify-write cycle: the complete block is read, modified and then written to another location, which is very expensive.
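To make the alignment idea concrete, here is a minimal Python sketch of the "one huge file, accessed only on block boundaries" layout. The 256K value is just the typical figure from above, not something you can rely on for a particular drive, and all the names (`ERASE_BLOCK`, `write_slot`, and so on) are made up for illustration. Also keep in mind that an aligned file offset only helps if the filesystem and partition are aligned too; the FTL still decides where data physically lands.

```python
import io

ERASE_BLOCK = 256 * 1024  # assumed erase block size in bytes; real SSDs vary


def round_up_to_block(size, block=ERASE_BLOCK):
    """Smallest multiple of `block` that is >= size."""
    return ((size + block - 1) // block) * block


def slot_offset(slot, block=ERASE_BLOCK):
    """Byte offset of slot number `slot` inside the big store file."""
    return slot * block


def write_slot(f, slot, payload, block=ERASE_BLOCK):
    """Write one record as a single, full, block-aligned chunk."""
    if len(payload) > block:
        raise ValueError("payload larger than one slot")
    f.seek(slot_offset(slot, block))
    # Pad with zero bytes so every write covers exactly one whole slot.
    f.write(payload.ljust(block, b"\0"))


def read_slot(f, slot, block=ERASE_BLOCK):
    """Read back one whole slot (including its zero padding)."""
    f.seek(slot_offset(slot, block))
    return f.read(block)


# io.BytesIO stands in for a real file opened with open(path, "r+b").
store = io.BytesIO()
write_slot(store, 2, b"hello")
record = read_slot(store, 2).rstrip(b"\0")
```

The trade-off is visible in `round_up_to_block`: a 64K record padded to a 256K slot wastes 192K of space, so in practice you would rather pack several small records into one slot and write the slot in one go.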
This is not a problem when the SSD is empty (because the controller has enough unused blocks), but it may become an issue if the SSD is full and heavily used, or if the IO pattern consists mostly of very small writes and the SSD becomes fragmented, so that the FTL has a harder time finding consecutive free flash pages.
As a side note: the system administrator should align the filesystem to the SSD erase block boundaries. This is really important.
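A rough way to check that alignment on Linux, again only as a sketch: sysfs exposes each partition's start in 512-byte units under `/sys/block/<disk>/<partition>/start`, so you can test whether the start is a multiple of the erase block size. The 256K erase block is an assumption here, and the helper names are hypothetical.

```python
SECTOR = 512              # sysfs reports partition starts in 512-byte units
ERASE_BLOCK = 256 * 1024  # assumed erase block size; check your drive


def read_start_sector(disk, part):
    """Read a partition's start sector from sysfs (Linux only)."""
    with open(f"/sys/block/{disk}/{part}/start") as fh:
        return int(fh.read().strip())


def is_aligned(start_sector, erase_block=ERASE_BLOCK):
    """True if the partition starts on an erase block boundary."""
    return (start_sector * SECTOR) % erase_block == 0


def next_aligned_sector(start_sector, erase_block=ERASE_BLOCK):
    """First sector at or after start_sector that is block aligned."""
    sectors_per_block = erase_block // SECTOR
    return -(-start_sector // sectors_per_block) * sectors_per_block
```

For example, the old DOS default of starting the first partition at sector 63 gives `is_aligned(63) == False`, while a start at sector 2048 (1 MiB) gives `is_aligned(2048) == True`. On a real machine you would call something like `is_aligned(read_start_sector("sda", "sda1"))`, with the device names adjusted to your system.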