Querying CompositeType columns in Cassandra using Hector


Here's a sample of the scenario I'm facing. Say I have this column family:

    create column family CompositeTypeCF 
    with comparator = 'CompositeType(IntegerType,UTF8Type)' 
    and key_validation_class = 'UTF8Type' 
    and default_validation_class = 'UTF8Type'

Here's some sample Java code, using Hector, showing how I insert data into this column family:

    Cluster cluster = HFactory.getOrCreateCluster("Test Cluster", "192.168.1.6:9160");
    Keyspace keyspaceOperator = HFactory.createKeyspace("CompositeTesting", cluster);

    // Column name is the composite (1, "test1"), matching CompositeType(IntegerType,UTF8Type)
    Composite colKey1 = new Composite();
    colKey1.addComponent(1, IntegerSerializer.get());
    colKey1.addComponent("test1", StringSerializer.get());

    Mutator<String> mutator = HFactory.createMutator(keyspaceOperator, StringSerializer.get());
    mutator.addInsertion("rowkey1", "CompositeTypeCF",
        HFactory.createColumn(colKey1, "Some Data", new CompositeSerializer(), StringSerializer.get()));
    mutator.execute();

This works, and if I go to the cassandra-cli and do a list I get this:

    $ list CompositeTypeCF;

    Using default limit of 100
    -------------------
    RowKey: rowkey1
    => (column=1:test1, value=Some Data, timestamp=1326916937547000)

My question now is this: How do I go about querying this data in Hector? Basically I would need to query it in a few ways:

  1. Give me the whole row where Row Key = "rowkey1"
  2. Give me the column data where the first part of the column name = some integer value
  3. Give me all the columns where the first part of the column name is within a certain range

Solution

  • Good starting point tutorial here.

    After finally needing a composite component myself and writing queries against the data, I figured out a few things that I wanted to share.

    When searching Composite columns, the results will be a contiguous block of columns.

    So, assuming a composite of 3 Strings, say my columns look like:

    A:A:A
    A:B:B
    A:B:C
    A:C:B
    B:A:A
    B:B:A
    B:B:B
    C:A:B
    

    For a search from A:A:A to B:B:B, the results will be

    A:A:A
    A:B:B
    A:B:C
    A:C:B
    B:A:A
    B:B:A
    B:B:B
    

    Notice the "C" components? There are no "C" components in the start/end terms, so what gives? These are simply all the columns that sort between A:A:A and B:B:B. The Composite search terms do not filter each component independently, as if processing nested loops (which is what I originally thought). Since the columns are sorted, you are specifying the start and end terms for a contiguous block of columns.
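To make the contiguous-block behaviour concrete, here is a minimal sketch in plain Java, with no Cassandra or Hector involved. It assumes that ordinary string ordering in a `TreeSet` is a fair stand-in for the composite comparator, which holds for these single-letter components. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical demo class: models sorted composite column names with a TreeSet.
class ContiguousSliceDemo {

    static final List<String> COLUMNS = Arrays.asList(
        "A:A:A", "A:B:B", "A:B:C", "A:C:B", "B:A:A", "B:B:A", "B:B:B", "C:A:B");

    // What a slice does: return every column that sorts between
    // start and end (both inclusive), as one contiguous block.
    static List<String> contiguousSlice(String start, String end) {
        TreeSet<String> sorted = new TreeSet<>(COLUMNS);
        return new ArrayList<>(sorted.subSet(start, true, end, true));
    }

    // What I originally expected: each component checked independently
    // against its own range, as if iterating nested loops.
    static List<String> nestedLoopFilter(String start, String end) {
        String[] lo = start.split(":");
        String[] hi = end.split(":");
        List<String> out = new ArrayList<>();
        for (String col : new TreeSet<>(COLUMNS)) {
            String[] parts = col.split(":");
            boolean inRange = true;
            for (int i = 0; i < parts.length; i++) {
                if (parts[i].compareTo(lo[i]) < 0 || parts[i].compareTo(hi[i]) > 0) {
                    inRange = false;
                }
            }
            if (inRange) {
                out.add(col);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Contiguous block: includes A:B:C and A:C:B
        System.out.println(contiguousSlice("A:A:A", "B:B:B"));
        // Per-component filter: drops every column containing a "C"
        System.out.println(nestedLoopFilter("A:A:A", "B:B:B"));
    }
}
```

The first call returns the seven columns listed above, while the per-component filter returns only five of them. The difference (A:B:C and A:C:B) is exactly the set of columns that surprised me.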

    When building the Composite search entries, you must specify the ComponentEquality for each component.

    In the end term, only the last component should be GREATER_THAN_EQUAL; all the others, including every component of the start term, should be EQUAL. For the example above:

    // Start of the slice: every component is EQUAL
    Composite start = new Composite();
    start.addComponent(0, "A", Composite.ComponentEquality.EQUAL);
    start.addComponent(1, "A", Composite.ComponentEquality.EQUAL);
    start.addComponent(2, "A", Composite.ComponentEquality.EQUAL);
    
    // End of the slice: only the last component is GREATER_THAN_EQUAL
    Composite end = new Composite();
    end.addComponent(0, "B", Composite.ComponentEquality.EQUAL);
    end.addComponent(1, "B", Composite.ComponentEquality.EQUAL);
    end.addComponent(2, "B", Composite.ComponentEquality.GREATER_THAN_EQUAL);
    
    // se and ce are the String and Composite serializers
    // (presumably StringSerializer.get() and new CompositeSerializer())
    SliceQuery<String, Composite, String> sliceQuery = HFactory.createSliceQuery(keyspace, se, ce, se);
    sliceQuery.setColumnFamily("CF").setKey(myKey);
    // The last argument (false) means the slice is not reversed
    ColumnSliceIterator<String, Composite, String> csIterator = new ColumnSliceIterator<String, Composite, String>(sliceQuery, start, end, false);
    
    while (csIterator.hasNext()) ....
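
The effect of the equality flag is easiest to see when the start and end terms are only a prefix of the column name, as in the question's "first part of the column name = some value" case. Below is a simplified plain-Java model, not Hector code. It assumes the flag acts as a tie-breaker that is compared once the component values match (-1, 0, 1 for LESS_THAN_EQUAL, EQUAL, GREATER_THAN_EQUAL), and that a shorter name sorts before a longer one sharing its prefix. That is my understanding of how CompositeType orders names, and all class and method names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of composite ordering; not part of Hector or Cassandra.
class CompositePrefixModel {

    static final int EQUAL = 0;
    static final int GREATER_THAN_EQUAL = 1;

    static final List<String> COLUMNS = Arrays.asList(
        "A:A:A", "A:B:B", "A:B:C", "A:C:B", "B:A:A", "B:B:A", "B:B:B", "C:A:B");

    // Compares two composites component by component. Only the last component
    // of each side carries a flag; earlier components are treated as EQUAL (0).
    static int compare(String[] a, int aLastFlag, String[] b, int bLastFlag) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = a[i].compareTo(b[i]);
            if (cmp != 0) {
                return cmp;
            }
            int fa = (i == a.length - 1) ? aLastFlag : EQUAL;
            int fb = (i == b.length - 1) ? bLastFlag : EQUAL;
            if (fa != fb) {
                return Integer.compare(fa, fb); // flag breaks the tie
            }
        }
        // All shared components tie: the shorter composite sorts first
        return Integer.compare(a.length, b.length);
    }

    // Returns the stored columns that sort between start and end (inclusive).
    static List<String> slice(String[] start, int startFlag, String[] end, int endFlag) {
        List<String> out = new ArrayList<>();
        for (String col : COLUMNS) {
            String[] parts = col.split(":");
            if (compare(parts, EQUAL, start, startFlag) >= 0
                    && compare(parts, EQUAL, end, endFlag) <= 0) {
                out.add(col);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] a = {"A"};
        // Prefix "A" with GREATER_THAN_EQUAL on the end term
        System.out.println(slice(a, EQUAL, a, GREATER_THAN_EQUAL));
        // Same prefix with EQUAL on the end term
        System.out.println(slice(a, EQUAL, a, EQUAL));
    }
}
```

In this model, start = A (EQUAL) with end = A (GREATER_THAN_EQUAL) selects the four A:x:y columns. With EQUAL on the end term nothing is selected, because the bare prefix A sorts before every A:x:y column. Widening the end prefix to B (GREATER_THAN_EQUAL) gives the range case: every column whose first component is A or B.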