Search code examples
sqlitesd-cardpremature-optimization

Fragmentation in SQLite used in a round-robin fashion without VACUUM


There's an SQLite database being used to store static-sized data in a round-robin fashion.

For example, 100 days of data are stored. On day 101, day 1 is deleted and then day 101 is inserted.

The number of rows is the same between days. The the individual fields in the rows are all integers (32-bit or less) and timestamps.

The database is stored on an SD card with poor I/O speed, something like a read speed of 30 MB/s.

VACUUM is not allowed because it can introduce a wait of several seconds and the writers to that database can't be allowed to wait for write access.

So the concern is fragmentation, because I'm inserting and deleting records constantly without VACUUMing. But since I'm deleting/inserting the same set of rows each day, will the data get fragmented? Is SQLite fitting day 101's data in day 1's freed pages?

And although the set of rows is the same, the integers may be 1 byte day and then 4 bytes another. The database also has several indexes, and I'm unsure where they're stored and if they interfere with the perfect pattern of freeing pages and then re-using them.

(SQLite is the only technology that can be used. Can't switch to a TSDB/RRDtool, etc.)


Solution

  • SQLite will reuse free pages, so you will get fragmentation (if you delete so much data that entire pages become free).

    However, SD cards are likely to have a flash translation layer, which introduces fragmentation whenever you write to some random sector.

    Whether the first kind of fragmentation is noticeable depends on the hardware, and on the software's access pattern. It is not possible to make useful predictions about that; you have to measure it.


    In theory, WAL mode is append-only, and thus easier on the flash device. However, checkpoints would be nearly as bad as VACUUMs.