Search code examples
scalacassandraphantom-dsl

Phantom DSL questions on set contains and sorting


I have written this scala code using Phantom DSL to query cassandra

  def getByGenreAndYear(genre: List[String], year: Int) : Future[Seq[Movie]] = {
    var criteria = select.where(_.genre contains genre.head)
    criteria = genre.tail.foldLeft(criteria){(accum, i) => accum.and(_.genre contains i)}
    criteria.and(_.year eqs year)
    criteria.allowFiltering().fetch()
  }

It works but I have a few questions

  • Set Contains

When querying whether a set contains values. is it right to build the query criteria like I have done? basically I have a AND clause for every value for which we want to check. Could this have been done in a single shot like

select.where(_.genre contains genreList)
  • Sorting

I am not able to generate the sort query. when I try to do

  def getByGenreAndYear(genre: List[String], year: Int) : Future[Seq[Movie]] = {
    var criteria = select.where(_.genre contains genre.head)
    criteria = genre.tail.foldLeft(criteria){(accum, i) => accum.and(_.genre contains i)}
    criteria.and(_.year eqs year)
    criteria.orderBy(_.year desc)
    criteria.allowFiltering().fetch()
  }

The code desn't even compile


Solution

  • Contains queries

    You cannot do contains queries on multiple values at the same time. You have several ways of achieving the above. The first is to use filtering and build a query.

    def getByGenreAndYear(genre: List[String], year: Int): Future[Seq[Movie]] = {
        val rootQuery = select.where(_.genre contains genre.head)
        genre.tail.foldLeft(rootQuery){ (accum, i) => accum.and(_.genre contains i)}
       .and(_.year eqs year)
       .orderBy(_.year desc)
       .allowFiltering().fetch()
      }
    

    This is what you are doing here except the phantom query builder is immutable, every single operation you do will create a new Query instance. There is a very good reason for that.

    The other way is to to sequence futures and not filter in Cassandra, which is not always very advisable.

    def getByGenreAndYear(genre: List[String], year: Int): Future[Seq[Movie]] = {
      // This will create a future to query for a single value.
      val futures = genre.map(item => select.where(_.year eqs year).and(_.genre contains item).fetch())
      // This will sequence the entire set, produce a list of lists, flatten it and create an union, and deduplicate by set conversion granted you define the right `hashCode` method on the `Movie` class.
      Future.sequence(futures) map {
       // You could also probably get away with lists.flatten
       lists => lists.foldRight(Nil)((item, acc) => item ::: acc)).toSet
      }
    }
    

    At this point in time doing a CONTAINS query is not possible against multiple values in a single query. You get an error:

    cql select * from marvis.expenses where tags contains ('food', 'office-food'); InvalidRequest: code=2200 [Invalid query] message="Invalid tuple type literal for value(tags) of type text"

    This however works:

    select * from marvis.expenses where tags contains 'food' and tags contains 'office-food' ALLOW FILTERING;

    Sorting

    To achieve sorting, you need a Compound or Composite key, and you can only sort by the clustering key part of the column, not the Partition key part. Have a look at this tutorial for more details on Cassandra indexing if need be.