database sqlite filesystems archive ntfs

Filesystem vs. SQLite for storing up to 10M files


I would like to store up to 10M files on a 2TB storage unit. The only properties I need are the file names and their contents (data).

The maximum file size is 100MB, and most files are smaller than 1MB. I need to be able to remove files, and both write and read speed are a priority, whereas storage efficiency, recovery, and integrity mechanisms are not needed.

I thought about NTFS, but most of its features are not needed, yet they can't be disabled, and I consider them overhead. A few of them: creation date, modification date, attributes, the journal, and of course permissions.

Since a filesystem comes with native features that I don't need, would you suggest using SQLite for this requirement? Or is there an obvious disadvantage I should be aware of? (One would guess that removing files would be a complicated task?)

(SQLite would be accessed via the C API.)

My goal is to use a better-suited solution to gain performance. Thanks in advance - Doori Bar


Solution

  • If your main requirement is performance, go with the native file system. DBMSs are not well suited to handling large BLOBs, so SQLite is not an option for you at all (I don't even know why everybody considers SQLite a plug for every hole).

    To improve the performance of NTFS (or any other file system you choose), don't put all the files into a single folder; instead, group them by the first N characters of their file names, or also by extension.

    There are also other file systems on the market, and some of them may let you disable features you don't need. You can check the file system comparison on Wikipedia.

    Correction: I ran some tests (not very extensive, though) that showed no performance benefit from grouping files into subdirectories for most types of operations; NTFS quite efficiently handled 26^4 (456,976) empty files, named AAAA to ZZZZ, in a single directory. So you need to check the efficiency on your particular file system.
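
For anyone who wants to try the prefix-grouping idea before benchmarking it, here is a minimal sketch in Python. The function name `sharded_path`, the `prefix_len` and `by_extension` parameters, the lowercasing of the prefix (assumed reasonable because NTFS is case-insensitive by default), and the `_noext` bucket are all illustrative assumptions, not something from the answer above. It only computes the target path; creating the directories and writing the file is left to the caller.

```python
from pathlib import PurePosixPath


def sharded_path(root: str, filename: str, prefix_len: int = 2,
                 by_extension: bool = False) -> PurePosixPath:
    """Map a file name to root/[<ext>/]<prefix>/<filename>.

    Hypothetical helper: the layout is an assumption, chosen only to
    illustrate grouping by the first N characters (and by extension).
    """
    # Keep only the final path component so callers cannot escape root.
    name = PurePosixPath(filename).name
    if not name:
        raise ValueError("filename must not be empty")

    parts = [root]
    if by_extension:
        # Files without an extension go into an assumed "_noext" bucket.
        ext = PurePosixPath(name).suffix.lstrip(".").lower() or "_noext"
        parts.append(ext)

    # Lowercase the prefix so "AAAA" and "aaaa" share one directory.
    parts.append(name[:prefix_len].lower())
    parts.append(name)
    return PurePosixPath(*parts)


if __name__ == "__main__":
    print(sharded_path("store", "AAAA.bin"))
    print(sharded_path("store", "AAAA.bin", by_extension=True))
```

With `prefix_len=2` and names drawn from A to Z, this spreads files over at most 26^2 = 676 directories. Given the correction above, it is worth measuring whether this actually helps on your file system before adopting it.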