I have created a table in Cassandra whose primary key is a column of type timeuuid. I am able to identify each record uniquely using just a millisecond-precision timestamp value stored as a bigint.
I used the DataStax Java driver to connect to Cassandra. Before inserting a record into the database, I convert its millisecond timestamp into a UUID. This is overhead that could be removed.
Is there any benefit to using timeuuid instead of bigint, considering that the records can be identified without timeuuid's uniqueness?
Is there a performance difference between the timeuuid and bigint data types?
There shouldn't be a significant performance impact from generating a timeuuid from a timestamp. timeuuid is useful when many events may happen in the same millisecond and you need sorting: because a timeuuid stores time in 100-nanosecond units, you can get up to 10,000 different values within a single millisecond. A typical use case is a table with a structure like this:
create table tuuid (
    pk int,
    tuuid timeuuid,
    ....
    ....,
    primary key (pk, tuuid));
In this case, you get sorting (ascending or descending) together with uniqueness of the values for tuuid. Of course, you could come up with a primary key of (pk, timestamp, random-value), but with timeuuid you don't need an additional column for uniqueness. One drawback of timeuuid is integration with other systems, Spark for example: Spark doesn't have this type, so it may not be able to push filters down.
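To illustrate where the 10,000 values per millisecond come from, here is a minimal Java sketch that builds a version 1 (time-based) UUID by hand using only the JDK. The class name TimeUuidSketch, the slot parameter, and the fixed clockSeqAndNode value are assumptions made for illustration. In a real application you would normally rely on the driver's own time-UUID utility class (UUIDs in driver 3.x, Uuids in driver 4.x) rather than on code like this.

```java
import java.util.UUID;

class TimeUuidSketch {
    // Number of 100-ns intervals between the UUID epoch (1582-10-15) and the Unix epoch.
    private static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;

    /**
     * Builds a version 1 UUID for the given millisecond.
     * slot (0..9999) selects one of the 100-ns positions inside that millisecond.
     * clockSeqAndNode is a hypothetical fixed value; real generators derive it
     * from a clock sequence and a node identifier.
     */
    static UUID timeUuid(long epochMillis, int slot, long clockSeqAndNode) {
        if (slot < 0 || slot >= 10_000) {
            throw new IllegalArgumentException("slot must be in 0..9999");
        }
        long ts = epochMillis * 10_000 + slot + UUID_EPOCH_OFFSET;
        long msb = (ts << 32)                        // time_low: lowest 32 bits of ts
                 | ((ts & 0xFFFF00000000L) >>> 16)   // time_mid: next 16 bits
                 | 0x1000L                           // version 1
                 | ((ts >>> 48) & 0x0FFFL);          // time_hi: top 12 bits
        // The two highest bits "10" mark the standard UUID variant.
        long lsb = 0x8000000000000000L | (clockSeqAndNode & 0x3FFFFFFFFFFFFFFFL);
        return new UUID(msb, lsb);
    }

    /** Recovers the Unix millisecond timestamp embedded in a version 1 UUID. */
    static long toEpochMillis(UUID uuid) {
        return (uuid.timestamp() - UUID_EPOCH_OFFSET) / 10_000;
    }

    public static void main(String[] args) {
        long millis = 1_700_000_000_000L;
        UUID first = timeUuid(millis, 0, 1L);
        UUID last = timeUuid(millis, 9_999, 1L);
        // Two distinct UUIDs that both map back to the same millisecond.
        System.out.println(first + " -> " + toEpochMillis(first));
        System.out.println(last + " -> " + toEpochMillis(last));
    }
}
```

Both UUIDs decode to the same millisecond but differ in their 100-ns position, which is what lets timeuuid keep events unique and ordered within one millisecond.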
If you don't need uniqueness, then just switch to timestamp. It is represented internally as an 8-byte long, the same as bigint, but you don't need to do the conversions yourself.
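As a rough illustration of why timestamp and bigint cost the same to store, the following JDK-only sketch encodes epoch milliseconds as 8 big-endian bytes and decodes them back into an Instant. The class and method names are hypothetical, and the driver performs the equivalent encoding for you when a column is declared as timestamp, so this is only meant to show the shape of the data.

```java
import java.nio.ByteBuffer;
import java.time.Instant;

class TimestampSketch {
    /** Encodes epoch milliseconds as an 8-byte big-endian signed integer. */
    static byte[] encode(long epochMillis) {
        return ByteBuffer.allocate(Long.BYTES).putLong(epochMillis).array();
    }

    /** Decodes the same 8 bytes back into a point in time. */
    static Instant decode(byte[] bytes) {
        return Instant.ofEpochMilli(ByteBuffer.wrap(bytes).getLong());
    }

    public static void main(String[] args) {
        long millis = 1_700_000_000_000L;
        byte[] encoded = encode(millis);
        System.out.println(encoded.length + " bytes -> " + decode(encoded));
    }
}
```

With a timestamp column the application can work with date/time objects directly instead of passing raw long values around.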