I'm still trying to wrap my head around composite keys in Cassandra. I picked up this piece of code from https://github.com/thobbs/phpcassa/blob/master/examples/composites.php and I am struggling to understand what it means (please see the questions below and in the comments):
$cf->insert_format = ColumnFamily::ARRAY_FORMAT;
$cf->return_format = ColumnFamily::ARRAY_FORMAT;
$key1 = array("key", 1); // Which one of these is a column name?
$key2 = array("key", 2);
$columns = array(
    array(array(0, "a"), "val0a"), // Which is value, and which is column name?
    array(array(1, "a"), "val1a"),
    array(array(1, "b"), "val1b"),
    array(array(1, "c"), "val1c"),
    array(array(2, "a"), "val2a"),
    array(array(3, "a"), "val3a")
);
/**
 * What type of queries (in CQL if possible) can I achieve with this?
 */
What I would like to understand is:
In `array("key", 1);`, are `key` and `1` the two columns composing this key, or is `1` a value of `key`?
And likewise for `array(array(0, "a"), "val0a")`?
I'm new to NoSQL technologies and this is twisting my mind.
Thank you for your assistance :-)
EDIT
Just a few more questions:
I would like to have a column family with following structure:
CREATE COLUMN FAMILY users (
userid int,
username varchar,
firstname varchar,
lastname varchar,
PRIMARY KEY (userid,username)
)
// How can I represent this structure with Phpcassa?
// I tried to make every column `array("firstname" => "my name")`, but it didn't work
Can part of the key be left empty (e.g. `username = null`) and maybe add a value later?

The next two lines of that example probably help:
$cf->insert($key1, $columns);
$cf->insert($key2, $columns);
I'm guessing slightly here since I don't know PHP, but it seems clear from the naming that `cf` is the column family, and that the two `insert()` calls add multiple columns to the two rows with keys `$key1` and `$key2`.
The row keys are composite keys, i.e. the first row key is a composite of the string `"key"` and the number `1`.
In phpcassa the composite keys are constructed as arrays, I believe.
$key1 = array("key", 1);
$key2 = array("key", 2);
Note that in the example, both the row keys and the column names are composite keys.
That makes `$columns` an array of columns; each column needs a name (key) and a value.
So, for example, `array(0, "a")` is a column name (the column names are also composite keys), and `"val0a"` is a column value.
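To make the name/value split concrete, here is a rough sketch in plain PHP. It does not use phpcassa or a Cassandra connection at all; it only takes apart the same array layout as the example. `describeColumn()` is a made-up helper for illustration, not something the library provides.

```php
<?php
// Plain PHP only: no phpcassa or Cassandra connection is needed to run this.
// Same layout as the example: each entry is array(column name, column value).
$columns = array(
    array(array(0, "a"), "val0a"),
    array(array(1, "a"), "val1a"),
    array(array(1, "b"), "val1b"),
);

// Hypothetical helper (not part of phpcassa): splits one column into its parts.
function describeColumn(array $column): string
{
    list($name, $value) = $column;   // element 0 is the name, element 1 the value
    list($first, $second) = $name;   // the name itself has two components
    return sprintf('name=(%d, "%s") value="%s"', $first, $second, $value);
}

foreach ($columns as $column) {
    echo describeColumn($column), PHP_EOL;
}
```

Each printed line shows the two-part column name first and the single value second, which is the reading of `$columns` described above.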
The data could be visualised as follows: first, the general layout of rows and columns in Cassandra (showing 2 rows each with 3 columns, for example). Note that the columns don't have to follow a tabular structure - we can have name3 in one row and name4 in another, or completely unrelated column names in different rows.
row1 -> name1  name2  name3  ...
        val1   val2   val3   ...
row2 -> name1  name2  name4  ...
        val1   val2   val4   ...
Next, using some of the specific (composite) keys from the example (2 rows of 6 columns). This is how it is actually stored (assuming that this is the correct sort order for these columns, which would depend on the comparator).
("key", 1) -> (0, "a")  (1, "a")  (1, "b")  (1, "c")  (2, "a")  (3, "a")
              "val0a"   "val1a"   "val1b"   "val1c"   "val2a"   "val3a"
("key", 2) -> (0, "a")  (1, "a")  (1, "b")  (1, "c")  (2, "a")  (3, "a")
              "val0a"   "val1a"   "val1b"   "val1c"   "val2a"   "val3a"
But because of the composite keys, you could also visualise it with another level of nesting (here, just expanding the column names). This gives the same kind of structure that Cassandra supercolumns were sometimes used for:
("key", 1) -> 0                 1                 2                 3
              "a" -> "val0a"    "a" -> "val1a"    "a" -> "val2a"    "a" -> "val3a"
                                "b" -> "val1b"
                                "c" -> "val1c"
I suspect it would become clearer if you run the example and can see the outputs!
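If running the full example against a cluster is not convenient, the nested view can be reproduced with a small sketch in plain PHP. This only regroups the `$columns` array by the first component of each column name; it says nothing about how Cassandra stores the data internally, and `nestColumns()` is a hypothetical helper, not a phpcassa function.

```php
<?php
// Plain PHP only: this regroups the example's array, it does not talk to Cassandra.
$columns = array(
    array(array(0, "a"), "val0a"),
    array(array(1, "a"), "val1a"),
    array(array(1, "b"), "val1b"),
    array(array(1, "c"), "val1c"),
    array(array(2, "a"), "val2a"),
    array(array(3, "a"), "val3a"),
);

// Hypothetical helper: groups values by the first, then the second name component.
function nestColumns(array $columns): array
{
    $nested = array();
    foreach ($columns as $column) {
        list($name, $value) = $column;
        list($outer, $inner) = $name;
        $nested[$outer][$inner] = $value;   // e.g. $nested[1]["b"] = "val1b"
    }
    return $nested;
}

$nested = nestColumns($columns);
print_r($nested);
```

Here `$nested[1]` holds the three entries `"a"`, `"b"` and `"c"`, matching the second group in the picture above.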
Update to address the extra questions:
I think you can decide independently whether to use composite row keys and composite column names: see the configuration lines, one for the column names, which are (Long, Ascii), and one for the row keys, which are (Ascii, Long).
"comparator_type" => "CompositeType(LongType, AsciiType)",
"key_validation_class" => "CompositeType(AsciiType, LongType)"
You can't have a null key - in Cassandra you would simply omit that column (because it isn't really a table) and add it later if you want.
Just a brief comment on your column family design (since this answer is getting very long!). I'd consider why you want a composite primary key - surely the userid should be unique anyway?
You can just use a row per user, keyed on userid (or on a composite of userid,username if you really need to), then a column for each of the other fields. Pretty much like a standard relational table. I don't see any need to use composite column names here. Maybe find some simpler phpcassa examples first before trying the composite keys...
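As a rough illustration of that layout, here is a sketch in plain PHP where an ordinary array stands in for the column family. It assumes one row per `userid` with a column per field; the sample values are made up, and `insertColumns()` is a hypothetical helper. With phpcassa the actual write would go through `$cf->insert($key, $columns)` as in the example, which this sketch does not exercise.

```php
<?php
// Plain PHP model: $users stands in for the column family, keyed by userid.
$users = array();

// Hypothetical helper: adds columns to the row stored under $userid,
// keeping any columns that are already there.
function insertColumns(array &$users, int $userid, array $columns): void
{
    $existing = isset($users[$userid]) ? $users[$userid] : array();
    $users[$userid] = array_merge($existing, $columns);
}

// First write: username is simply omitted rather than set to null.
insertColumns($users, 42, array("firstname" => "Ada", "lastname" => "Lovelace"));

// Later write: the missing column is added to the same row.
insertColumns($users, 42, array("username" => "ada"));

print_r($users[42]);
```

The point is only that a column that is not known yet is left out of the first write and added afterwards, instead of being stored as null.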