Tags: sorting, cassandra, hector

Sort by key in Cassandra


Let's assume I have a keyspace with a column family that stores user objects and the key of these objects is the username.

How can I use Hector to get a list of users sorted by username?

I tried a RangeSlicesQuery. Paging works fine with it, but the results are not sorted in any way.

I'm an absolute Cassandra beginner. Can anyone point me to a simple example that shows how to sort a column family by key? Please ask if you need more details on what I have tried.

Edit:

The result was not sorted because I used the default RandomPartitioner instead of the OrderPreservingPartitioner in cassandra.yaml.

It's probably better not to rely on sorting by key and to use a secondary index instead.


Solution

  • Quoting Cassandra: The Definitive Guide:

    Column names are stored in sorted order according to the value of compare_with. Rows, on the other hand, are stored in an order defined by the partitioner (for example, with RandomPartitioner, they are in random order, etc.)

    I guess you are using RandomPartitioner which

    ... return data in an essentially random order.

    You should probably use OrderPreservingPartitioner (OPP) where

    Rows are therefore stored by key order, aligning the physical structure of the data with your sort order.

    Be aware that OPP is inefficient: keys stored in order tend to be spread unevenly across the nodes, which can create hot spots.


    (edit on Mar 07, 2014)
    Important:

    This answer is very old now.

    The partitioner is a system-wide setting. You can set it through the partitioner option in cassandra.yaml; see the Cassandra 1.1 documentation. Again, OPP is highly discouraged: it is already marked as deprecated in the 1.1 documentation, and it may have been removed from later versions. If you do want to use OPP, you may want to revisit your architecture.
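
To make the partitioner behavior quoted in the answer more concrete, here is a minimal sketch in plain Java, with no Cassandra or Hector dependency. It assumes that a RandomPartitioner token can be approximated as the MD5 digest of the row key read as a non-negative integer. The class name PartitionerOrderDemo, its methods, and the sample usernames are made up for illustration. The point is only that ordering rows by hashed token has nothing to do with the alphabetical order of the keys.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class PartitionerOrderDemo {

    // Approximation (assumption): token = MD5(key) interpreted as a non-negative integer.
    static BigInteger md5Token(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(key.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(digest).abs();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    // Order in which a hash-based partitioner would walk the rows: by token.
    static List<String> tokenOrder(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort(Comparator.comparing(PartitionerOrderDemo::md5Token));
        return sorted;
    }

    // Order an order-preserving partitioner would give: by the key itself.
    static List<String> keyOrder(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort(Comparator.naturalOrder());
        return sorted;
    }

    public static void main(String[] args) {
        // Hypothetical usernames used as row keys.
        List<String> users = Arrays.asList("erin", "bob", "alice", "dave", "carol");
        System.out.println("token order: " + tokenOrder(users));
        System.out.println("key order:   " + keyOrder(users));
    }
}
```

If sorted usernames are needed while the cluster keeps a hash-based partitioner, one option is to page through the keys with the range query and sort them on the client, as keyOrder does here. That is only practical for small result sets.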