Search code examples
cassandra

Cassandra avoid ALLOW FILTERING


I have a table of this type:

CREATE TABLE MY_TABLE
(
A TEXT,
B TEXT,
TOTAL INT,
PRIMARY KEY (A, B )
);

I need to make queries from key A and B separately.

Example:

SELECT sum(total) FROM MY_TABLE WHERE A='A';

In this one return me the correct result.

Example:

SELECT sum(total) FROM MY_TABLE WHERE B='B';

With this one I get the error of allow filtering.

InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING"

How can I avoid using it? Should I separate the data in two tables?


Solution

  • Why do you get the issue ?

    There is something not obvious in the CQL Language: if the primary key is set up with a single parenthesis, the first argument is considered the partition key. So under the hood your table construction is:

    CREATE TABLE MY_TABLE (
      A TEXT,
      B TEXT,
      TOTAL INT,
      PRIMARY KEY ((A), B )
    );
    

    You want to sum the items of the same partition to avoid allow filtering so yes only A works.

    How to fix this ?

    • Multiple tables: the most performant and you will end up MY_TABLE
    • Create a table with counters to keep all totals
    • Use a secondary index on column B