cassandra datastax cql cassandra-2.0 nosql

CQL - rows with different columns

Is it possible to store in Cassandra (using CQL) two rows in one column family that have the same number of columns, but where one of the columns differs between them? Something like this:

// column family
'users' : {

    // row 1
    'john' : {
        name: 'John',
        lastname: 'Smith',
        email: 'john@gmail.com'
    }

    // row 2
    'jack' : {
        name: 'Jack',
        lastname: 'Sparrow',
        age: 33
    }
}

My current CQL code:

CREATE KEYSPACE people WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
USE people;

CREATE COLUMNFAMILY users (
    username varchar PRIMARY KEY,
    name varchar,
    lastname varchar,
    email varchar
);

INSERT INTO users (username, name, lastname, email) VALUES ('john', 'John', 'Smith', 'john@gmail.com');

ALTER TABLE users DROP email;
ALTER TABLE users ADD age int;

INSERT INTO users (username, name, lastname, age) VALUES ('jack', 'Jack', 'Sparrow', 33);

SELECT * FROM users;

OUTPUT:

 username | age  | lastname | name
----------+------+----------+------
     john | null |    Smith | John
     jack |   33 |  Sparrow | Jack

Solution

Yes. Cassandra has a sparse data model: it does not create a cell in the underlying SSTable for any column you don't insert a value for in a given CQL row.

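That is, if you had not run DROP email, Jack would still not have an empty cell for email unless you deliberately inserted a null (which writes a tombstone). So you can declare both email and age on the table and simply leave the unused column out of each INSERT.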
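As a minimal sketch of that approach, reusing the table and values from the question: it assumes you are in the people keyspace and that the original users table has been dropped or not yet created. Both email and age are declared up front, and each INSERT names only the columns that row actually has.

```cql
-- Sketch only: assumes the 'people' keyspace from the question and no existing 'users' table.
USE people;

-- Declare every column that any row might use.
CREATE TABLE users (
    username varchar PRIMARY KEY,
    name varchar,
    lastname varchar,
    email varchar,
    age int
);

-- john: cells are written for name, lastname and email; nothing is written for age.
INSERT INTO users (username, name, lastname, email) VALUES ('john', 'John', 'Smith', 'john@gmail.com');

-- jack: cells are written for name, lastname and age; nothing is written for email.
INSERT INTO users (username, name, lastname, age) VALUES ('jack', 'Jack', 'Sparrow', 33);

-- cqlsh shows null for columns a row has no cell for.
SELECT * FROM users;
```

With this schema there is no need to ALTER the table between inserts, and john keeps his email value.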

You can inspect SSTables with sstable2json to see how the data is laid out on disk. Remember to run nodetool flush first; otherwise the writes may still be only in the memtable and you may end up looking at an empty directory!