Tags: java, cassandra, cql, cassandra-3.0, datastax-java-driver

Cassandra lookup by a list of primary keys in Java


I am implementing a feature which requires looking up Cassandra by a list of primary keys.

Below is some example data, where id is the primary key:

mytable
id          column1
1           423
2           542
3           678
4           45534
5           435634
6           2435
7           678
8           4564
9           546

Most of my queries are lookups by id, but in some special cases I would like to fetch data for a list of ids. This is how I currently do it:


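// Looks up a single row by its primary key.
public Object fetchFromCassandraForId(int id);

int[] ids = {1, 3, 5, 7, 9};
List<Object> results = new ArrayList<>();
// One lookup, and therefore one network call, per id.
for (int id : ids) {
  results.add(fetchFromCassandraForId(id));
}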

This issues multiple network calls to Cassandra. Is it possible to batch them somehow? In other words, does Cassandra support a fast lookup by a list of ids, such as:

select column1 from mytable where id in (1, 3, 5, 7, 9);

Any help or pointers would be appreciated.


Solution

  • If id is the full primary key, then Cassandra supports this, although it's not recommended from a performance point of view, because of how such a query is executed:

    • the request is sent to a coordinator node
    • the coordinator finds a replica for each id and sends an individual request to each of them
    • the coordinator waits for the results from every replica, collects them into a single result set, and sends it back

    As a result:

    • all of your sub-queries have to wait for the slowest replica
    • you have an additional network hop from the coordinator to each replica
    • you put more pressure on the coordinator node, as it needs to keep the results in memory

    If you instead issue parallel, asynchronous requests from the application, one for each id value, then you:

    • avoid the additional hop: if you're using prepared statements with token-aware load balancing, each query is sent directly to a replica
    • can start processing results as you receive them, without waiting for everything

    So sending parallel asynchronous requests could be faster than sending one request with IN...
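
The parallel approach described above could be sketched roughly as follows. This is only an illustration of the fan-out pattern using the JDK's CompletableFuture, not driver code: the class ParallelLookup, the method fetchAll, and the asyncLookup parameter are hypothetical names. In a real application asyncLookup would presumably wrap an asynchronous driver call (such as executeAsync on a bound prepared statement) adapted to a CompletableFuture; here an in-memory map stands in for the table from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical helper: fans out one asynchronous lookup per id.
class ParallelLookup {

    // asyncLookup is an assumed stand-in for an asynchronous driver call.
    // onEachResult is invoked as soon as an individual lookup completes,
    // possibly on another thread, so it must be thread-safe.
    static <K, V> List<V> fetchAll(List<K> ids,
                                   Function<K, CompletableFuture<V>> asyncLookup,
                                   Consumer<V> onEachResult) {
        // Start every request before waiting on any of them.
        List<CompletableFuture<V>> futures = new ArrayList<>();
        for (K id : ids) {
            futures.add(asyncLookup.apply(id).thenApply(value -> {
                onEachResult.accept(value);
                return value;
            }));
        }
        // Collect the results in the same order as the ids.
        List<V> results = new ArrayList<>();
        for (CompletableFuture<V> future : futures) {
            results.add(future.join());
        }
        return results;
    }

    public static void main(String[] args) {
        // In-memory stand-in for a subset of mytable (id -> column1).
        Map<Integer, Integer> mytable =
                Map.of(1, 423, 3, 678, 5, 435634, 7, 678, 9, 546);
        List<Integer> values = fetchAll(
                List.of(1, 3, 5, 7, 9),
                id -> CompletableFuture.supplyAsync(() -> mytable.get(id)),
                value -> { /* process each row as it arrives */ });
        System.out.println(values); // prints [423, 678, 435634, 678, 546]
    }
}
```

All requests are started before any result is awaited, so the total wait is bounded by the slowest single lookup rather than the sum of all of them, and the per-result callback lets processing begin before the last response arrives. A production version would likely also cap the number of in-flight requests and handle failed lookups.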