Tags: performance, database-performance, cassandra-2.0

Scaling Cassandra: Multiple Tables vs. a Single Table


I have a use case where I am going to store user activities.

I am considering two approaches:

  1. Creating a table for every user
  2. Creating one single table for all users

Performance-wise, fetching data for a single user seems easier with the first approach than with the second, where one table stores the data for all users.

Is there a limit to the number of tables we can have in Cassandra?

I have read posts about relational databases that recommend against using multiple tables.

I tried both approaches in Cassandra: a single table and multiple tables.

With multiple tables, I am worried about the growing number of tables in the database.

With a single table, I am worried about the number of rows growing beyond a billion.

Can anyone suggest which approach I should use?


Solution

  • NoSQL databases are designed for horizontal scalability, and published Cassandra benchmarks demonstrate how well it scales horizontally.

    Approach 1:

    Although the number of tables grows, you can spread them across different servers (sharding), so you don't have to worry about that. However, if your user count later grows very large, creating a new table for each user might not be feasible (at least not from a performance perspective). Think about a good sharding strategy (based on region, data size, etc.).

    Approach 2:

    With a single table, a billion rows is not a problem, and performance will still be good. Even a single server with a good configuration and proper database tuning (buffers, indexing, queries) will give good results.

    IMHO, choose the strategy based on ease of coding, ease of use, and maintainability, from both a current and a future perspective. (I think the second option is better if the number of users is small and constant over time.)
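To make the single-table approach more concrete, here is a minimal Python sketch of how one table keyed by user could still serve per-user reads cheaply. Everything in it is an assumption for illustration: the `user_activity` table and its column names, the three node names, and the `node_for` helper, which uses a simple hash-modulo rule rather than Cassandra's actual token ring and replication. The CQL string is only a hypothetical schema and is not executed.

```python
import hashlib
from bisect import insort
from collections import defaultdict

# Hypothetical CQL schema for the single-table approach (illustrative names).
# user_id is the partition key; activity_time is the clustering column.
SCHEMA = """
CREATE TABLE user_activity (
    user_id text,
    activity_time timestamp,
    activity text,
    PRIMARY KEY (user_id, activity_time)
) WITH CLUSTERING ORDER BY (activity_time DESC);
"""

NODES = ["node-a", "node-b", "node-c"]  # hypothetical cluster members


def node_for(user_id: str) -> str:
    """Map a partition key to a node by hashing it.

    Simplified stand-in: a real cluster uses a token ring with replication.
    """
    digest = hashlib.md5(user_id.encode("utf-8")).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]


class ActivityTable:
    """Toy model of one table whose rows are grouped by user_id."""

    def __init__(self):
        # node -> user_id -> list of (activity_time, activity), sorted by time
        self.storage = {node: defaultdict(list) for node in NODES}

    def insert(self, user_id, activity_time, activity):
        partition = self.storage[node_for(user_id)][user_id]
        insort(partition, (activity_time, activity))  # keep rows time-ordered

    def activities_for(self, user_id, limit=10):
        # Only this user's partition is read, however many users exist.
        partition = self.storage[node_for(user_id)].get(user_id, [])
        return partition[::-1][:limit]  # newest first


if __name__ == "__main__":
    table = ActivityTable()
    table.insert("alice", 1, "login")
    table.insert("bob", 2, "login")
    table.insert("alice", 3, "upload")
    print(node_for("alice"), table.activities_for("alice"))
```

The point of the sketch is that a read for one user touches only that user's partition, so the total row count across all users does not make a single-user fetch slower in this model.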