I would like to use db4o as the backend of a custom cache implementation. Normally my program involves loading into memory some 40,000,000 objects and working on them simultaneously. Obviously this requires a lot of memory and I thought of perhaps persisting some of the objects (those not in a cache) to a db4o database. My preliminary tests show db4o to be a bit slower than I would like (about 1,000,000 objects took 17 minutes to persist). However, I was using the most basic setup.
I was doing something like this:
using (var reader = new FileUnitReader(Settings, Dictionary, m_fileNameResolver, ObjectFactory.Resolve<DataValueConverter>(), ObjectFactory.Resolve<UnitFactory>()))
using (var db = Db4oEmbedded.OpenFile(Db4oEmbedded.NewConfiguration(), path))
{
    var timer = new Stopwatch();
    timer.Start();

    // Store every unit the reader produces, one at a time.
    IUnit unit = reader.GetNextUnit();
    while (unit != null)
    {
        db.Store(unit);
        unit = reader.GetNextUnit();
    }

    timer.Stop();
    db.Close();
    var elapsed = timer.Elapsed;
}
Can anyone offer advice on how to improve performance in this scenario?
Well, I think there are a few options to improve performance in this situation.
I've found that reflection overhead can become quite a large part of the cost in scenarios like this, so you should try the fast reflector. Note that the FastNetReflector consumes more memory, but in your scenario that won't really matter. You can use the fast reflector like this:
var config = Db4oEmbedded.NewConfiguration();
// Swap the default reflector for the faster one (trades some memory for speed).
config.Common.ReflectWith(new FastNetReflector());

using (var container = Db4oEmbedded.OpenFile(config, fileName))
{
    // ...store your objects here...
}
When I ran similar tiny 'benchmarks', I found that a larger cache size also improves performance a little, even when you are writing to the database:
var config = Db4oEmbedded.NewConfiguration();
// Cache file pages in memory (here 128 pages of 4 KB each).
config.File.Storage = new CachingStorage(new FileStorage(), 128, 1024 * 4);
Other notes: db4o's transaction handling isn't really optimized for giant transactions. When you store 1,000,000 objects in a single transaction, the commit may take ages or you may run out of memory. Therefore you may want to commit more often, for example after every 100,000 stored objects. Of course, you need to check whether it really makes a difference in your scenario.
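Putting these pieces together, here is a minimal sketch of how the store loop from your question could commit in batches (it reuses the reader, path and IUnit types from your code, and combines the two configuration tweaks above; the 100,000 batch size is just a starting point to tune):

var config = Db4oEmbedded.NewConfiguration();
// Options from above: faster reflector and a larger page cache.
config.Common.ReflectWith(new FastNetReflector());
config.File.Storage = new CachingStorage(new FileStorage(), 128, 1024 * 4);

using (var db = Db4oEmbedded.OpenFile(config, path))
{
    const int batchSize = 100000;
    var storedSinceCommit = 0;

    IUnit unit = reader.GetNextUnit();
    while (unit != null)
    {
        db.Store(unit);

        // Commit periodically so a single transaction never grows too large.
        if (++storedSinceCommit >= batchSize)
        {
            db.Commit();
            storedSinceCommit = 0;
        }

        unit = reader.GetNextUnit();
    }

    // Commit whatever remains in the final partial batch.
    db.Commit();
}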