Search code examples
c#.netmemory-managementcollectionsmemory-mapped-files

Memory-mapped file IList implementation, for storing large datasets "in memory"?


I need to perform operations chronologically on huge time series implemented as IList. The data is ultimately stored into a database, but it would not make sense to submit tens of millions of queries to the database.

Currently the in-memory IList triggers an OutOfMemory exception when trying to store more than 8 million (small) objects, though I would need to deal with tens of millions.

After some research, it looks like the best way to do it would be to store data on disk and access it through an IList wrapper.

Memory-mapped files (introduced in .NET 4.0) seem the right interface to use, but I wonder what is the best way to write a class that should implement IList (for easy access) and internally deal with a memory-mapped file.

I am also curious to hear if you know about other ways ! I thought for example of an IList wrapper using data from db4o (someone mentionned here using a memory-mapped file as the IoAdapterFile, though using db4o probably adds a performance cost vs. dealing directly with the memory-mapped file).

I have come across this question asked in 2009, but it did not yield useful answers or serious ideas.


Solution

  • I found this PersistentDictionary<>, but it only works with strings, and by reading the source code I am not sure it was designed for very large datasets.

    More scalable (up to 16 TB), the ESENT PersistentDictionary<>, uses the ESENT database engine present in Windows (XP+) and can store all serializable objects containing simple types.

    Disk Based Data Structures, including Dictionary, List and Array with an "intelligent" serializer looked exactly like what I was looking for, but it did not run smoothly with extremely large datasets, especially as it does not make use of the "native" .NET MemoryMappedFiles yet, and support for 32 bits systems is experimental.

    Update 1: I ended up implementing my own version that makes extensive use of .NET MemoryMappedFiles; it is very fast and I will probably release it on Codeplex once I have made it better for more general purpose usages.

    Update 2: TeaFiles.Net also worked great for my purpose. Highly recommended (and free).