mongodb cassandra hbase database nosql

NoSQL database for an address book with billions of records


Which database is a suitable choice for storing an address book with billions of rows (name, email address, phone number, etc.)? The application will be very read-intensive (>99% reads) and needs high consistency and availability, with servers distributed worldwide. Queries will be on either email address or phone number. I am currently considering HBase, Cassandra, or MongoDB.


Solution

  • Cassandra might be a good choice for that. It supports multiple data centers, so for worldwide coverage you can set up a few data centers around the world and reduce latency by having clients access the nearest one.

    For fast lookups based on email address and phone number you'd probably store the data denormalized in two tables, with one table using email as the primary key and another table using phone number as the primary key.

    You should be able to get the read performance you want by adding more nodes, since read throughput scales with the number of nodes in each data center.

    However, if you want to run ad hoc queries on this data based on fields other than the primary key, then Cassandra would not be a good choice.
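The two-table layout described above can be sketched as follows. This is only an illustration under stated assumptions: the keyspace name `address_book`, the table names, the data center names `dc_us` and `dc_eu`, and the replication factor of 3 are all hypothetical, and the Python class is an in-memory stand-in that mimics the dual-write pattern rather than a real Cassandra client.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# CQL schema sketch. Keyspace, table, and data center names are hypothetical,
# and the replication factor of 3 per data center is an assumption.
SCHEMA_CQL = """
CREATE KEYSPACE IF NOT EXISTS address_book
  WITH replication = {'class': 'NetworkTopologyStrategy', 'dc_us': 3, 'dc_eu': 3};

CREATE TABLE IF NOT EXISTS address_book.contacts_by_email (
  email text PRIMARY KEY, name text, phone text
);

CREATE TABLE IF NOT EXISTS address_book.contacts_by_phone (
  phone text PRIMARY KEY, name text, email text
);
"""


@dataclass(frozen=True)
class Contact:
    name: str
    email: str
    phone: str


class AddressBook:
    """In-memory stand-in for the two denormalized tables."""

    def __init__(self) -> None:
        self.by_email: Dict[str, Contact] = {}  # mirrors contacts_by_email
        self.by_phone: Dict[str, Contact] = {}  # mirrors contacts_by_phone

    def add(self, contact: Contact) -> None:
        # Denormalization: every write goes to both tables, so each
        # lookup key is served by a single primary-key read.
        self.by_email[contact.email] = contact
        self.by_phone[contact.phone] = contact

    def find_by_email(self, email: str) -> Optional[Contact]:
        return self.by_email.get(email)

    def find_by_phone(self, phone: str) -> Optional[Contact]:
        return self.by_phone.get(phone)
```

The cost of this design is that each contact is written twice and the application has to keep both copies in sync. With a workload that is more than 99% reads, that trade-off is usually acceptable.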