Alternative or successor to GDBM


We have a GDBM key-value database as the backend to a load-balanced, web-facing application that is implemented in C++. The data served by the application has grown very large, so our admins have moved the GDBM files from "local" storage (on the webservers, or very close by) to a large, shared, remote, NFS-mounted filesystem.

This has affected performance. Our performance tests (in a test environment) show page load times jumping from hundreds of milliseconds (for local disk) to several seconds (over NFS, local network), and sometimes getting as high as 30 seconds. I believe a large part of the problem is that the application makes lots of random reads from the GDBM files, and these are slow over NFS. I expect this to get even worse in production (where the front end and back end have even more network hardware between them) and as our database gets even bigger.
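
As a rough back-of-the-envelope illustration of why per-read latency could dominate here, the sketch below models a page that issues a number of dependent lookups one after another. The function name and every number used with it are assumptions for illustration only, not measurements from this system.

```cpp
#include <cassert>

// Rough model: a page that issues `reads` dependent (sequential) lookups,
// each costing `per_read_ms` milliseconds, spends about
// reads * per_read_ms milliseconds waiting on I/O.
// Both inputs are hypothetical; substitute values measured on your own setup.
double estimated_io_ms(int reads, double per_read_ms) {
    assert(reads >= 0 && per_read_ms >= 0.0);
    return reads * per_read_ms;
}
```

With made-up inputs, `estimated_io_ms(400, 0.25)` gives 100 ms, while `estimated_io_ms(400, 5.0)` gives 2000 ms. Under this simple model, the same access pattern moves from hundreds of milliseconds to seconds purely because each read got slower.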

While this is not a critical application, I would like to improve performance, and I have some resources available, including application developer time and Unix admins. My main constraint is time: I only have the resources for a few weeks.

As I see it, my options are:

  1. Improve NFS performance by tuning parameters. My instinct is we won't get much out of this, but I have been wrong before, and I don't really know very much about NFS tuning.

  2. Move to a different key-value database, such as memcachedb or Tokyo Cabinet.

  3. Replace NFS with some other protocol (iSCSI has been mentioned, but I am not familiar with it).

How should I approach this problem?


Solution

  • Don't get too hung up on the “relational versus non-relational” comparison. It appears to be irrelevant for this issue.

    The line your application has crossed is a different one: from a small database on fast local file storage to a large database accessed over the network. Crossing that line means you are now better served by a dedicated, network-serviced database management system. Whether that server manages relational databases isn't relevant here.

    For getting it up and running quickly, MariaDB (the successor to MySQL) is probably your best bet. If you foresee it growing much beyond where it is now, you might as well put it in PostgreSQL since that's where it will need to go eventually anyway :-)
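
One way to keep such a migration manageable in a C++ application is to hide the storage backend behind a small interface, so the GDBM calls can be swapped for database queries without touching the rest of the code. The following is only a minimal sketch under that assumption: `KeyValueStore` and `InMemoryStore` are hypothetical names, and the in-memory class is a stand-in for testing, not a real backend.

```cpp
#include <map>
#include <string>

// Hypothetical interface: the application talks to this, not to GDBM directly.
class KeyValueStore {
public:
    virtual ~KeyValueStore() = default;
    // Returns true and fills `value` if `key` exists, false otherwise.
    virtual bool get(const std::string& key, std::string& value) const = 0;
    virtual void put(const std::string& key, const std::string& value) = 0;
};

// In-memory stand-in for tests. A real backend would instead wrap
// gdbm_fetch(), or send a query to the database server for a two-column
// key/value table (the table layout is an assumption, not from the answer).
class InMemoryStore : public KeyValueStore {
public:
    bool get(const std::string& key, std::string& value) const override {
        std::map<std::string, std::string>::const_iterator it = data_.find(key);
        if (it == data_.end()) return false;
        value = it->second;
        return true;
    }
    void put(const std::string& key, const std::string& value) override {
        data_[key] = value;
    }
private:
    std::map<std::string, std::string> data_;
};
```

With this shape, the existing GDBM code and the new database client would each become one more subclass, and the two could be compared side by side in the performance tests before switching over.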