I have to store a protocol buffer structure in Cassandra 3.x. It is defined in a .proto file as:
message Attribute
{
required string key = 1;
oneof value {
int64 integerValue = 2;
float floatValue = 3;
string stringValue = 4;
}
}
To store multiple Attributes
I was thinking about this CQL definition.
CREATE TABLE ... attributes: map<text, tuple<int, float, text> ...
and in each tuple 2 of 3 components would actually be null
. I haven't tested this syntax yet but are there any downsides using this approach? Maybe there is a better way, i.e. User Defined Types?
Let's try this out. I'll start with a simple table, containing a valuemap
column of type map<text,tuple<int,float,text>
as you have above:
CREATE TABLE tupleTest (
key text,
value text,
valuemap map<text, FROZEN<tuple<int,float,text>>>,
PRIMARY KEY (key));
I'll INSERT
some data:
INSERT INTO tupletest (key,value,valuemap) VALUES ('1','A',{'a':(0,0.0,'hi')});
INSERT INTO tupletest (key,value,valuemap) VALUES ('2','B',{'b':(0,null,'hi')});
INSERT INTO tupletest (key,value,valuemap) VALUES ('3','C',{'c':(null,null,'hi')});
And then I'll SELECT
it, just to see:
aploetz@cqlsh:stackoverflow> SELECT * FROM tupletest ;
key | value | valuemap
-----+-------+---------------------------
3 | C | {'c': (None, None, 'hi')}
2 | B | {'b': (0, None, 'hi')}
1 | A | {'a': (0, 0, 'hi')}
(3 rows)
The main apprehension about explicitly INSERT
ing NULL values into Cassandra, is that in "normal" columns they actually create tombstones. But since we are not setting an entire column to NULL, merely an element in a tuple (nested inside a map), this is not the case. In fact, they are showing as None
. And when I view the underlying SSTables, I also do not see evidence that a tombstone has been written.
Normally, I'd say that explicitly INSERT
ing a NULL into Cassandra is a terrible, terrible idea. But in this case, it shouldn't cause you any issues. Now, as to whether or not this is considered to be "legit" or a good practice...well, my data modeling senses do not approve. I would find another way to represent the absence of a value in a tuple type, as someone (the developer who follows you) could see this and interpret that as being "ok" to explicitly INSERT
NULLs into other column values.