Search code examples
cassandramessageboxdatamodel

Design data model for messaging system with Cassandra


I am new to Cassandra and trying to build a data model for messaging system. I found few solutions but none of them exactly match my requirements. There are two main requirements:

  1. Get a list of last messages for a particular user, from all other users, sorted by time.
  2. Get a list of messages for one-to-one message history, sorted by time as well.

I thought of something like this,

CREATE TABLE chat (
  to_user text,
  from_user_text,
  time text,
  msg text,
  PRIMARY KEY((to_user,from_user),time) 
  ) WITH CLUSTERING ORDER BY (time DESC);

But this design has few issues, like I wont be able to satisfy first requirement since this design requires to pass from_user as well. And also this would be inefficient when number of (to_user,from_user) pair increases.


Solution

  • You are right. That one table won't satisfy both queries, so you will need two tables. One for each query. This is a core concept with Cassandra data modeling. Query driven design.

    So the query looking for messages to a user:

    CREATE TABLE chat (
      to_user text,
      from_user_text,
      time text,
      msg text,
      PRIMARY KEY((to_user),time) 
      ) WITH CLUSTERING ORDER BY (time DESC);
    

    Messages from a user to another user.

    CREATE TABLE chat (
      to_user text,
      from_user_text,
      time text,
      msg text,
      PRIMARY KEY((to_user),from_user,time) 
      ) WITH CLUSTERING ORDER BY (time DESC);
    

    Slight difference from yours: from_user is a clustering column and not a part of the partition key. This is minimize the amount of select queries needed in application code.

    It's possible to use the second table to satisfy both queries, but you will have to supply the 'from_user' to use a range query on time.