Tags: nosql, cassandra, hector

Creating composite columns in Cassandra


I need to store benchmark runs for each nightly build. To do this, I came up with the following data model.

BenchmarkColumnFamily= {

   build_1: {
       (Run1, TPS) : 1000K
       (Run1, Latency) : 0.5ms
       (Run2, TPS) : 1000K
       (Run2, Latency) : 0.5ms
       (Run3, TPS) : 1000K
       (Run3, Latency) : 0.5ms
    }

    build_2: {
       ...
    }
...

}
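
As a rough illustration of this layout (a plain in-memory model, not a Cassandra API), a row can be pictured as a mapping from (run, metric) pairs to values. With a CompositeType(UTF8Type,UTF8Type) comparator, columns within a row are compared component by component, which Python's tuple ordering approximates for string components. The names `build_1`, `ordered_columns`, and `columns_for_run` are hypothetical and exist only for this sketch.

```python
# Hypothetical in-memory picture of one row (row key "build_1").
# Column names are (run, metric) tuples; values are kept as strings,
# matching the UTF8Type validation class used in the question.
build_1 = {
    ("Run2", "TPS"): "1000K",
    ("Run1", "Latency"): "0.5ms",
    ("Run3", "TPS"): "1000K",
    ("Run1", "TPS"): "1000K",
    ("Run3", "Latency"): "0.5ms",
    ("Run2", "Latency"): "0.5ms",
}


def ordered_columns(row):
    """Return (name, value) pairs sorted component by component.

    Tuple comparison of strings stands in for the composite comparator here;
    this is an approximation for illustration only.
    """
    return sorted(row.items())


def columns_for_run(row, run):
    """Return the sorted columns whose first component equals `run`."""
    return [(name, value) for name, value in ordered_columns(row) if name[0] == run]
```

Because all columns of one run sort next to each other, fetching every metric of a run corresponds to reading one contiguous range of the row.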

To create such a schema, I came up with the following command in cassandra-cli:

create column family BenchmarkColumnFamily with 
    comparator = 'CompositeType(UTF8Type,UTF8Type)' AND 
    key_validation_class=UTF8Type AND
    default_validation_class=UTF8Type AND
    column_metadata = [
    {column_name: TPS, validation_class: UTF8Type}
    {column_name: Latency, validation_class: UTF8Type}
    ];

Does the above command create the schema I intend? I am confused because when I insert data into this CF using set BenchmarkColumnFamily['1545']['TPS']='100'; the insert succeeds even though the comparator type is composite. The following command also executes successfully:

set BenchmarkColumnFamily['1545']['Run1:TPS']='1000';

What am I missing?


Solution

  • I don't think you're doing anything wrong. The CLI parses the strings for values based on the type, probably using org.apache.cassandra.db.marshal.AbstractType<T>.fromString(). For composite types, it uses ':' as the field separator (I haven't seen this documented, but I've experimented with Java code to convince myself).

    Without a ':', it seems to set only the first part of the composite and leave the second as null. To set only the second part, you can use:

    set BenchmarkColumnFamily['1545'][':NOT_TPS']='999';
    

    From the CLI, dump out the CF:

    list BenchmarkColumnFamily;
    

    and you should see all the names (for all the rows), e.g.

    RowKey: 1545
    => (column=:NOT_TPS, value=999, timestamp=1342474086048000)
    => (column=Run1:TPS, value=1000, timestamp=1342474066695000)
    => (column=TPS, value=100, timestamp=1342474057824000)
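
The behaviour described in this answer can be modeled with a small sketch. This is a simplified stand-in written for illustration, not Cassandra's actual parsing code: it assumes the name is split on ':' and that a missing or empty part is treated as unset. The function name `parse_composite_name` and the use of `None` for an unset component are assumptions of this sketch.

```python
from typing import Optional, Tuple


def parse_composite_name(text: str, size: int = 2) -> Tuple[Optional[str], ...]:
    """Split a CLI-style column name into `size` composite components.

    Simplified model: ':' separates components, and a missing or empty
    component is represented as None. Real Cassandra may distinguish these
    cases; this sketch does not.
    """
    parts = text.split(":")
    if len(parts) > size:
        raise ValueError(f"too many components in {text!r} (expected at most {size})")
    components = [part if part else None for part in parts]
    # Pad with None so the result always has exactly `size` entries.
    components += [None] * (size - len(components))
    return tuple(components)


# The three column names used in this thread, under the model above.
names = {
    "TPS": parse_composite_name("TPS"),
    "Run1:TPS": parse_composite_name("Run1:TPS"),
    ":NOT_TPS": parse_composite_name(":NOT_TPS"),
}
```

Under this model, 'TPS' fills only the first component, 'Run1:TPS' fills both, and ':NOT_TPS' fills only the second, which is consistent with all three inserts being accepted by the composite comparator.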
    

    There is no way (via the CLI) to constrain the composite elements to be non-null or to take specific values; that is something you'd have to do in code.

    Also, the column_metadata option for the CF creation is unnecessary, since you've already listed the default validation as UTF8Type.
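
Since constraints on the composite elements have to be enforced in application code, a minimal sketch of such a check might look like the following. It is independent of any client library; the names `ALLOWED_METRICS` and `validate_column_name` are hypothetical, and the allowed metric set is taken from the data model in the question.

```python
# Metrics allowed as the second composite component, per the question's data model.
ALLOWED_METRICS = frozenset({"TPS", "Latency"})


def validate_column_name(run, metric):
    """Return (run, metric) if both components are acceptable, else raise ValueError.

    This check would run in the application before the column is written;
    it does not talk to Cassandra itself.
    """
    if not run:
        raise ValueError("first component (run) must be a non-empty string")
    if metric not in ALLOWED_METRICS:
        raise ValueError(
            f"second component must be one of {sorted(ALLOWED_METRICS)}, got {metric!r}"
        )
    return (run, metric)
```

Calling `validate_column_name("Run1", "TPS")` returns the pair unchanged, while a name such as `(None, "NOT_TPS")` is rejected before it can reach the column family.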