Cassandra DB Query for System Date

I have one table customer_info in a Cassandra DB & it contains one column as billing_due_date, which is date field (dd-MMM-yy ex. 17-AUG-21). I need to fetch the certain fields from customer_info table based on billing_due_date where billing_due_date should be equal to system date +1.

Can anyone suggest a Cassandra DB query for this?

Solution

fetch the certain fields from customer_info table based on billing_due_date

transaction_id is primarykey , It is just generated through uuid()

Unfortunately, there really isn't going to be a good way to do this. Right now, the data in the customer_info table is distributed across all nodes in the cluster based on a hash of the transaction_id. Essentially, any query based on something other than transaction_id is going to read from multiple nodes, which is a query anti-pattern in Cassandra.

In Cassandra, you need to design your tables based on the queries that they need to support. For example, choosing transaction_id as the sole primary key may distribute well, but it doesn't offer much in the way of query flexibility.

Therefore, the best way to solve for this query, is to create a query table containing the data from customer_info with a key definition of PRIMARY KEY (billing_date,transaction_id). Then, a query like this should work:

> SELECT * FROM customer_info_by_date
  WHERE billing_due_date = toDate(now()) + 2d;

 billing_due_date | transaction_id                       | name
------------------+--------------------------------------+---------
       2021-08-20 | 2fe82360-e314-4d5b-aa33-5deee9f03811 | Rinzler
       2021-08-20 | 92cb9ee5-dee6-47fe-b372-0829f2e384cd |     Clu

(2 rows)

Note that for this example, I am using the system date plus 2 days out. So in your case, you'll want to adjust the "duration" aspect from 2d down to 1d. Cassandra 4.0 allows date arithmetic, so this should work just fine if you are on that version. If you are not, you'll have to do the "system date plus one" calculation on the app side.

Another way to go about this, would be to create a secondary index on billing_due_date, but I don't recommend that path as it will query multiple nodes to build the result set.