Search code examples
javacassandradatastaxcqldatastax-java-driver

How to get next resultset in Cassandra DB


Table contains more than 100,000 rows. I am trying to get the total row count. I do not want to use async.

Statement statement = session.prepare("SELECT user_id from table");
ResultSet resultSet = session.execute(statement);
int count = resultSet.getAvailableWithoutFetching()

I am getting 5000 rows from the above. Is it possible to iterate and get the next 5000 resultsets till 100,000 so I can sum up the rows to get the total count. I am getting timeout error when using session.execute(statement.setPageSize(100000));

  • DSE - 6.8.9
  • Cql - 3.4.5
  • Cassandra - 4.0.0.681
  • Java driver - 4.6.1

Solution

  • Are you able to modify your database query? If so, you could modify your query to SELECT COUNT(user_id) FROM table and then you will only need to look at one row of data.

    If not, it looks like you need to use the .fetchMoreResults() method to get the next results. I see you don't want to use async at all. That can be avoided by just calling .get()

    Statement statement = session.prepare("SELECT user_id from table");
    ResultSet resultSet = session.execute(statement);
    int count = resultSet.getAvailableWithoutFetching();
    while (!resultSet.isExhausted()) {
        resultSet = resultSet.fetchMoreResults().get();
        count += resultSet.getAvailableWithoutFetching();
    }
    

    Full disclosure; I have not used DSE before, this is just what I could glean from their docs. Things that will need to be checked still, that I can think of:

    1. should the condition be checking isExhausted() or isFullyFetched()?
    2. does getAvailableWithoutFetching() return the number of results in the current page, or the total number of results fetched so far? I've assumed the former, but if it is the latter then count should just be set based on the last resultSet's available size, instead of being added to for each page

    EDIT: I see you have version 4.6.1, ok. It seems like the pagingState() is worth looking into.

    Statement statement = session.prepare("SELECT user_id from table");
    ResultSet resultSet = session.execute(statement);
    int count = resultSet.getAvailableWithoutFetching();
    ByteBuffer pagingState = resultSet.getExecutionInfo().getPagingState();
    while (pagingState != null) {
        statement = statement.copy(pagingState);
        resultSet = session.execute(statement);
        pagingState = resultSet.getExecutionInfo().getPagingState();
        count += resultSet.getAvailableWithoutFetching();
    }
    return count;