Search code examples
nosqldatastoredata-storage

I am looking for a key-value datastore which has following (preferable) properties


I am trying to build a distributed task queue, and I am wondering if there is any data store, which has some or all of the following properties. I am looking to have a completely decentralized, multinode/multi-master self replicating datastore cluster to avoid any single point of failure.

Essential

  • Supports Python pickled object as Value.
  • Persistent.

More, the better, In decreasing order of importance (I do not expect any datastore to meet all the criteria. :-))

  • Distributed.
  • Synchronous Replication across multiple nodes supported.
  • Runs/Can run on multiple nodes, in multi-master configuration.
  • Datastore cluster exposed as a single server.
  • Round-robin access to/selection of a node for read/write action.
  • Decent python client.
  • Support for Atomicity in get/put and replication.
  • Automatic failover
  • Decent documentation and/or Active/helpful community
  • Significantly mature
  • Decent read/write performance

Any suggestions would be much appreciated.


Solution

  • Cassandra (open-sourced by facebook) has pretty much all of these properties. There are several Python clients, including pycassa.

    Edited to add:

    Cassandra is fully distributed, multi-node P2P, with tunable consistency levels (i.e. your replication can be synchronous or asynchronous or a mixture of both). Clients can connect to any server. Failover is automatic, and new servers can be added on-the-fly for load balancing. Cassandra is in production use by companies such as Facebook. There is an O'Reilly book. Write performance is extremely high, read performance is also high.