mongodb gridfs

GridFS disk management


In my environments, a DB can be 5-10 GB or as large as 10 TB (video recordings).
Focusing on the 5-10 GB case: if I keep the default settings for preallocation and small files, I can actually lose 20-40% of the disk space to allocations.
In my production environments, the disk can be 512 GB, but the user can limit the DB allocation to only 10 GB.

To implement this, I have a scheduled task that deletes the oldest documents from the DB when the DB's dataSize reaches a certain threshold.
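To make the cleanup task concrete, here is a minimal sketch of the selection step only, written in Python for brevity rather than with the C# driver. `StoredFile` and `select_files_to_delete` are hypothetical names, and the sketch assumes each file's size and upload date are already known (for GridFS, roughly the `length` and `uploadDate` fields of the files collection). It ignores per-chunk overhead, so the real dataSize will not drop by exactly the sum of the deleted lengths.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List


@dataclass
class StoredFile:
    """Hypothetical stand-in for one stored file: id, upload time, size in bytes."""
    file_id: str
    upload_date: datetime
    length: int


def select_files_to_delete(files: Iterable[StoredFile],
                           data_size: int,
                           threshold: int) -> List[StoredFile]:
    """Pick the oldest files until the estimated data size is at or below the threshold."""
    to_delete: List[StoredFile] = []
    excess = data_size - threshold
    if excess <= 0:
        # Already under the threshold: nothing to delete.
        return to_delete
    # Oldest first, so recent recordings are the last to go.
    for f in sorted(files, key=lambda item: item.upload_date):
        if excess <= 0:
            break
        to_delete.append(f)
        excess -= f.length
    return to_delete
```

The scheduled task would then delete each selected file through the driver and re-check dataSize on its next run.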

I can't use a capped collection (GridFS, sharding limitations, cannot delete arbitrary documents...), and I can't use the --noprealloc/--smallfiles flags, because I need file inserts to be efficient.

So what happens is this: if dataSize gets to 10 GB, fileSize will be at least 12 GB, so I need to take that into consideration and lower the threshold by 2 GB (and lose a lot of disk space).

What I do want is to tell mongo to pre-allocate all of the 10 GB the user requested, and to disable further pre-allocation.

For example, running mongod with --noprealloc and --smallfiles, but pre-allocating all 10 GB in advance.

Another protection I gain here is against sudden disk-full errors. If the user regularly downloads Game of Thrones episodes to the same drive, they can't take space from the DB's 10 GB, since it's already pre-allocated.

(using C# driver)


Solution

  • I think I found a solution: you might want to look at the --quota and --quotaFiles command line options. In your case, you might also want to add the --smallfiles option. So

    mongod --smallfiles --quota --quotaFiles 24
    

    should give you exactly 10224 MB for your data, which, together with the default 16 MB namespace file, equals your target size of 10 GB (10240 MB), excluding indices.
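The arithmetic behind a given quota can be checked with a short sketch. It assumes the documented MMAPv1 data-file progression: with --smallfiles the first file is 16 MB and each subsequent file doubles up to a 512 MB cap, while without it files start at 64 MB and cap at 2048 MB. The function names are hypothetical, and the namespace file and journal files are not counted.

```python
from itertools import islice
from typing import Iterator


def data_file_sizes_mb(small_files: bool) -> Iterator[int]:
    """Yield successive data-file sizes in MB under the assumed doubling scheme."""
    first, cap = (16, 512) if small_files else (64, 2048)
    size = first
    while True:
        yield size
        size = min(size * 2, cap)


def total_size_mb(num_files: int, small_files: bool) -> int:
    """Total size in MB of the first num_files data files of one database."""
    return sum(islice(data_file_sizes_mb(small_files), num_files))


def files_needed(target_mb: int, small_files: bool) -> int:
    """Smallest number of data files whose combined size reaches target_mb."""
    total = 0
    for n, size in enumerate(data_file_sizes_mb(small_files), start=1):
        total += size
        if total >= target_mb:
            return n
    raise AssertionError("unreachable: the generator is infinite")


if __name__ == "__main__":
    print(files_needed(10224, small_files=True))   # 24
    print(total_size_mb(24, small_files=True))     # 10224
```

Under these assumptions, reaching 10224 MB with --smallfiles takes 24 data files: five doubling files (16 through 256 MB, 496 MB in total) plus nineteen 512 MB files. It is worth checking the value passed to --quotaFiles against this progression for your MongoDB version.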