python, cassandra, pycassa

Querying Cassandra columns with Pycassa


I have a Cassandra DB table similar to this:

key | name | client
1     A      C1
2     B      C2
3     C      C1

I access my Cassandra db with Python (Pycassa).

Is there a way to query the database to find the client with the most occurrences? For instance, in this case it is C1, with 2.

I am not sure if it is possible to directly query Cassandra with Pycassa. If it is possible, how could I achieve that, or should I use other tools?

Thanks

PS: I need to use NoSQL, so please do not suggest a relational database.


Solution

  • You would need to keep a separate count of occurrences for each client, updated as rows are written. If perfect accuracy isn't required, you can use Cassandra's built-in distributed counters. Otherwise, you'll need a more accurate scheme (counting columns, periodic recounts, or both), or you can store the counter in a relational database.
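
The "periodic recount" option mentioned above could be sketched roughly as follows. This is only an illustration, not a definitive implementation: it tallies the `client` column on the application side, and it assumes rows arrive as `(key, columns)` pairs, which is the shape Pycassa's `ColumnFamily.get_range()` yields. The function name `most_common_clients`, the sample data, and the keyspace and column family names in the comments are hypothetical.

```python
from collections import Counter


def most_common_clients(rows, column="client", top=1):
    """Count how often each value of `column` occurs across `rows`.

    `rows` is any iterable of (key, columns_dict) pairs. Rows that do
    not contain `column` are skipped.
    """
    counts = Counter(
        columns[column] for _key, columns in rows if column in columns
    )
    # most_common returns (value, count) pairs, highest count first.
    return counts.most_common(top)


# In-memory rows mirroring the table in the question, so the sketch
# runs without a cluster.
sample_rows = [
    ("1", {"name": "A", "client": "C1"}),
    ("2", {"name": "B", "client": "C2"}),
    ("3", {"name": "C", "client": "C1"}),
]

print(most_common_clients(sample_rows))  # [('C1', 2)]

# Against a live cluster, the rows might come from Pycassa instead
# (assumed names: keyspace 'MyKeyspace', column family 'MyTable'):
#
#   from pycassa.pool import ConnectionPool
#   from pycassa.columnfamily import ColumnFamily
#   pool = ConnectionPool('MyKeyspace')
#   cf = ColumnFamily(pool, 'MyTable')
#   most_common_clients(cf.get_range(columns=['client']))
```

A full scan like this reads every row, so it is probably only reasonable as an occasional recount or for small tables. For frequent lookups, incrementing a per-client counter at write time (the distributed-counter option above) avoids the scan at the cost of some accuracy.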