I have about 44 million key-value pairs on hand that need to be stored on my laptop. The keys and values are all short texts, and the keys are all unique. Their total size is several gigabytes.
I tried to store them in a GDBM database. The problem is that whenever I insert a new record, GDBM checks whether the key already exists, and this gets slower and slower as the database grows. My loading program ran for 2 hours with no sign of finishing, so I gave up.
So now I’m looking for a key-value database system that
I don’t need to access the database often (maybe weekly or monthly). It’s just for my personal use, so no master-slave/concurrency/load-balancing magic is needed. I may need to look up some entries from time to time, or occasionally iterate over the whole set once or twice to compute some statistics. I will rarely (maybe never) change the data once it’s initially stored.
I’m using a 2011 MacBook Pro (no SSD) running OS X, if that helps.
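To make the access pattern concrete (load everything once, then occasionally look up a key or scan the whole set), here is a minimal sketch using Python's standard `dbm` module. It is only an illustration: the function names and the tiny sample data are made up, `dbm.open` picks whichever DBM backend the local Python build provides, and nothing here says how any backend behaves at 44 million records.

```python
import dbm
import os
import tempfile


def build_store(path, pairs):
    """Write (key, value) text pairs into a fresh DBM file (hypothetical helper)."""
    # Flag "n" always creates a new, empty database at `path`.
    with dbm.open(path, "n") as db:
        for key, value in pairs:
            # DBM files store bytes, so encode the text explicitly.
            db[key.encode("utf-8")] = value.encode("utf-8")


def lookup(path, key):
    """Return the value stored for `key`, or None if the key is absent."""
    with dbm.open(path, "r") as db:
        try:
            return db[key.encode("utf-8")].decode("utf-8")
        except KeyError:
            return None


def count_entries(path):
    """Scan every key once, as a stand-in for a whole-set statistics pass."""
    with dbm.open(path, "r") as db:
        return sum(1 for _ in db.keys())


if __name__ == "__main__":
    # Sample data only; the real input would be the full set of pairs.
    store_path = os.path.join(tempfile.mkdtemp(), "kv")
    build_store(store_path, [("apple", "fruit"), ("carrot", "vegetable")])
    print(lookup(store_path, "apple"))
    print(count_entries(store_path))
```

Since the data is written once and then only read, the store is opened read-only after the initial load, so later lookups and scans never modify it.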
What database should I use?
UPDATE: I’ve been testing some key-value DBs, including Google’s LevelDB.
…
…
…
Then I updated from the public beta of OS X 10.10 (which I had been using for all of these tests) to the release version, and the problem disappeared… … … The DBMs are all fast now, actually.
Maybe you can try Kyoto Cabinet: http://fallabs.com/kyotocabinet/