I have a table like this.
> CREATE TABLE docyard.documents (
> document_id text,
> namespace text,
> version_id text,
> created_at timestamp,
> path text,
> attributes map<text, text>,
> PRIMARY KEY (document_id, namespace, version_id, created_at) )
> WITH CLUSTERING ORDER BY (namespace ASC, version_id ASC, created_at ASC)
> AND bloom_filter_fp_chance = 0.01
> AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
> AND comment = ''
> AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
> 'max_threshold': '32'}
> AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99.0PERCENTILE';
I want to be able to run range queries under the following conditions:
select * from documents where namespace = 'something' and created_at > 'some-value' order by created_at allow filtering;
select * from documents where namespace = 'something' and path = 'something' and created_at > 'some-value' order by created_at allow filtering;
I have not been able to make these queries work in any form, even with secondary indexes. Can anyone please help?
I keep getting one error or another when trying to make it work.
First of all, don't use secondary indexes or ALLOW FILTERING. With time-series data, both will perform terribly over time.
To satisfy your first query, you will want to restructure your PRIMARY KEY and CLUSTERING ORDER like this:
PRIMARY KEY (namespace, created_at, document_id) )
WITH CLUSTERING ORDER BY (created_at DESC, document_id ASC);
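For context, here is a minimal sketch of what the full table and the first query might look like with that key. It assumes the columns stay exactly as in the question and that the table is recreated, since a primary key cannot be altered on an existing table. The timestamp value is only a placeholder.

```cql
-- Sketch only: same columns as the question, with the restructured key.
CREATE TABLE docyard.documents (
    document_id text,
    namespace text,
    version_id text,
    created_at timestamp,
    path text,
    attributes map<text, text>,
    PRIMARY KEY (namespace, created_at, document_id)
) WITH CLUSTERING ORDER BY (created_at DESC, document_id ASC);

-- First query: equality on the partition key, range on the first clustering column.
-- The timestamp literal is a placeholder value.
SELECT * FROM docyard.documents
WHERE namespace = 'something'
  AND created_at > '2016-01-01 00:00:00+0000';
```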
This will allow for the following:
- Partitioning by namespace.
- Sorting by created_at in DESCending order (most-recent rows read first).
- Uniqueness by document_id.
- No need for ALLOW FILTERING or ORDER BY in your query, as the necessary keys will be provided, and the results will already be sorted according to your CLUSTERING ORDER.
For your second query, I would create an additional query table. This is because in Cassandra, you need to model your tables to suit your queries. You may end up having several query tables for the same data, and that's ok.
CREATE TABLE docyardbypath.documents (
document_id text,
namespace text,
version_id text,
created_at timestamp,
path text,
attributes map<text, text>,
PRIMARY KEY ((namespace, path), created_at, document_id) )
WITH CLUSTERING ORDER BY (created_at DESC, document_id ASC);
This will:
- Partition by namespace and path.
- Allow rows within each unique combination of namespace and path to be sorted according to your CLUSTERING ORDER.
- Again remove the need for ALLOW FILTERING or ORDER BY in your query.
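As a rough illustration, the second query against this table could look like the sketch below. Because the same data now lives in more than one query table, each write has to go to all of them. One possible way to do that is a logged batch, shown here with made-up values ('doc-1', 'v1', '/some/path', and the 'author' attribute are purely illustrative, not from the question).

```cql
-- Second query: both partition key columns are supplied, then a range on created_at.
-- The timestamp literal is a placeholder value.
SELECT * FROM docyardbypath.documents
WHERE namespace = 'something'
  AND path = 'something'
  AND created_at > '2016-01-01 00:00:00+0000';

-- Write the same row to both query tables in one logged batch (illustrative values).
BEGIN BATCH
  INSERT INTO docyard.documents
    (namespace, created_at, document_id, version_id, path, attributes)
  VALUES
    ('something', '2016-01-02 10:00:00+0000', 'doc-1', 'v1', '/some/path', {'author': 'alice'});
  INSERT INTO docyardbypath.documents
    (namespace, path, created_at, document_id, version_id, attributes)
  VALUES
    ('something', '/some/path', '2016-01-02 10:00:00+0000', 'doc-1', 'v1', {'author': 'alice'});
APPLY BATCH;
```

A logged batch keeps the two tables from drifting apart if one of the writes fails partway, at the cost of some extra coordination on each write.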