I'm trying Cassandra with simple CRUD operations and don't understand how I should model the data.
Let's say we need to manage simple user data:
UserId | Email | Name
We want to be able to GET information by either UserId or Email. Also, we want to be able to change user info, i.e. Email and Name.
That leads me to a dilemma: to query by Email, I should add it to the PRIMARY KEY. But once it is part of the primary key, I won't be able to UPDATE it.
How should I change the data model or indexing to be able to UPDATE the data?
From what I've read, secondary indexes are evil in Cassandra and I should avoid them to keep Cassandra's performance at a good level.
Indeed, you should not use secondary indexes unless you absolutely have to. But if you need to search by email, you can create another table with two columns, Email and UserId. The primary key will be Email, and that is how you will look up a UserId by Email. Think of it as an index in a traditional relational database. Since the Email value should be unique, the lookup table approach should be more efficient than a secondary index, because each lookup reads a single partition by its key.
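As a rough sketch of that layout, the two tables could look like the following. The table names (users, users_by_email), the snake_case column names, and the uuid type for the user id are assumptions made for illustration, not something given in the question.

```cql
-- Main table: one row per user, keyed by the user id.
-- email and name are regular columns, so they can be changed with UPDATE.
CREATE TABLE users (
    user_id uuid PRIMARY KEY,
    email   text,
    name    text
);

-- Lookup table: one row per email address, pointing back to the user id.
-- email is the partition key here, so a query by email hits one partition.
CREATE TABLE users_by_email (
    email   text PRIMARY KEY,
    user_id uuid
);
```

Because Email is the key of the lookup table only, it stays an ordinary, updatable column in the main table.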
Once you have found the UserId by Email, you can use it in queries against the main table.
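A minimal sketch of the two-step read and of an email change, assuming a main table users (user_id uuid PRIMARY KEY, email text, name text) and a lookup table users_by_email (email text PRIMARY KEY, user_id uuid). These table names, the column names, and the sample values are hypothetical.

```cql
-- Step 1: resolve the user id from the email via the lookup table.
SELECT user_id FROM users_by_email WHERE email = 'old@example.com';

-- Step 2: read the user from the main table with the id returned above.
SELECT user_id, email, name
FROM users
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Changing the email: update the regular column in the main table, and
-- replace the lookup row, since a primary key value cannot be updated in place.
BEGIN BATCH
    UPDATE users SET email = 'new@example.com'
        WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;
    DELETE FROM users_by_email WHERE email = 'old@example.com';
    INSERT INTO users_by_email (email, user_id)
        VALUES ('new@example.com', 123e4567-e89b-12d3-a456-426614174000);
APPLY BATCH;
```

The logged batch is one way to keep the two tables from drifting apart: either all three statements are eventually applied or none are. It does not isolate the change from concurrent readers, and changing only Name needs just a plain UPDATE on the main table.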