I am trying to remodel a SQL database in Cassandra so that I can find a Cassandra equivalent for each of my SQL queries. I use CQL 3 and Cassandra v1.2. I modeled the db design in Cassandra so that it supports the ORDER BY clauses, and I denormalized tables to replace the join operations. However, I am at sea when it comes to equivalents for DISTINCT, SUM() and GROUP BY:
SELECT a1, MAX(b1) FROM demo1 GROUP BY a1
SELECT DISTINCT (a2) FROM demo2 WHERE b2='sea'
SELECT SUM(a3), SUM(b3) FROM demo3 WHERE c3='water' AND d3='ocean'
This has been a showstopper for my work for the past couple of days. Is there a way to model the db schema in Cassandra so that it supports queries of this kind? I can't think of one. How can such queries be implemented using Cassandra?
I read that a Hive layer over Cassandra can possibly make these queries work. I am wondering whether that is the only way such queries can be supported in Cassandra. Please advise on any other possible methods.
Cassandra doesn't support operations like these. You can use something like Hive on top, or there's a (non-free) product from Acunu that may do what you need.
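The other solution is to do the work yourself. For example, you can compute a sum by reading all the data from the relevant rows and adding it up in your application. Or you can maintain a Cassandra counter that you increment on the fly as data is written, so the total is already there when you need to read it.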
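To make the "do the work yourself" option concrete, here is a minimal Python sketch of the three queries from the question computed on the client side. It assumes the rows have already been fetched from Cassandra, with the WHERE filters applied, and are available as plain dicts. No driver call is shown, and the helper names (max_by_group, distinct_values, column_sums) and the sample rows are hypothetical, made up purely for illustration.

```python
# Client-side stand-ins for GROUP BY/MAX, DISTINCT and SUM.
# Assumption: `rows` is any iterable of dicts already read from Cassandra;
# these helpers are illustrative and not part of any Cassandra client API.

def max_by_group(rows, key, value):
    """Rough equivalent of: SELECT key, MAX(value) ... GROUP BY key."""
    best = {}
    for row in rows:
        k, v = row[key], row[value]
        if k not in best or v > best[k]:
            best[k] = v
    return best

def distinct_values(rows, column):
    """Rough equivalent of: SELECT DISTINCT column ..."""
    return {row[column] for row in rows}

def column_sums(rows, columns):
    """Rough equivalent of: SELECT SUM(c1), SUM(c2), ..."""
    totals = dict.fromkeys(columns, 0)
    for row in rows:
        for c in columns:
            totals[c] += row[c]
    return totals

# Made-up sample data standing in for query results.
demo1 = [{"a1": "x", "b1": 3}, {"a1": "x", "b1": 7}, {"a1": "y", "b1": 5}]
demo2 = [{"a2": "p"}, {"a2": "q"}, {"a2": "p"}]
demo3 = [{"a3": 1, "b3": 10}, {"a3": 2, "b3": 20}]

print(max_by_group(demo1, "a1", "b1"))
print(distinct_values(demo2, "a2"))
print(column_sums(demo3, ["a3", "b3"]))
```

This approach reads every matching row, so it only stays practical when the filtered result set is reasonably small; for totals over large or frequently updated data, the counter approach avoids the full read.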