I have to design a web page where a group leader can invite people to join his/her group. My requirements are really simple.
No sending duplicate emails out, if person was already contacted.
Show the group leader a list of invites sorted by invite date in ascending order.
Seems easy. I created this table.
CREATE TABLE invites (
email_address text,
invite_date timeuuid,
PRIMARY KEY (email_address, invite_date)
) WITH CLUSTERING ORDER BY (invite_date ASC);
Problem 1: LWT no use with invite_date as a Cluster column.
I figured I'd use LWT to ensure email_address is unique, only to find out IF NOT EXISTS only seems to work on the whole PRIMARY KEY, so LWT in C* does not work for me.
Problem 2: I cannot get an ordered list of invites back to save me life even with invite_date as a Cluster column.
If I take invite_date out, I cannot issue an 'order by' in CQL. That said, having invite_date out of the PK let's me use LWT...
I can't even get a 2 column table to fulfill 2 easy requirements! Any help on data modeling design for this problem is much appreciated.
New Dec. 4, 2015:
Additional to business requirements, a technical requirement I have is: I want to make sure I model this correctly in Cassandra, so that it allows me to use CQL's LIMIT and pagingState capabilities in the Java driver. This means, I cannot just read all the rows in, sort on Java side and return the results.
Problem 1: I think that the easiest way to handle this might be to have two separate tables, one for the emails_in_group and one for invites_by_group. This will allow each query to be fulfilled independantly. The emails_in_group table would look something like this:
CREATE TABLE emails_in_group (
email_address text,
group_id text,
PRIMARY KEY (email_address , group_id));
Then this, combined with the table as defined in Problem 2 below could be updated using a conditional batch statement as shown here: http://docs.datastax.com/en/cql/3.1/cql/cql_using/use-batch-static.html
Problem 2: So the basic problem here is that as you have your data currently modeled each email_address value will be in it's own partition and then within that partition the invite_date will be ordered. @bydsky is right when he said that you need to add something like a group_id to your table and make it the partition key portion of your Primary Key. If you do this and then add invite_date as a cluster column to the partition key then all records for that group_id will be stored in the same partition and the Order By will work. Order By only works within the same partiton, not across partitions which is what you were asking it to do.
CREATE TABLE invites_by_group (
group_id text,
email_address text,
invite_date timestamp,
PRIMARY KEY (group_id, invite_date));