I have a pretty simple question which I can't find an answer to on the Internet or on stackoverflow:
Is the number of Parameters in the IN-Operator in Cassandra limited?
I have made some tests with a simple table with Integer-Keys from 1 to 100000. If I put the keys from 0 to 1000 in my IN-Operator (like SELECT * FROM test.numbers WHERE id IN (0,..,1000)
) I get the correct number of rows back. But for example for 0 to 100000 I always get only 34464 rows back. And for 0 to 75000 its 9464.
I am using the Datastax Java Driver 2.0 and the relevant codeparts look like the following:
String query = "SELECT * FROM test.numbers WHERE id IN ?;";
PreparedStatement ps = iot.getSession().prepare(query);
bs = new BoundStatement(ps);
List<Integer> ints = new ArrayList<Integer>();
for (int i = 0; i < 100000; i++) {
ints.add(i);
}
bs.bind(ints);
ResultSet rs = iot.getSession().execute(bs);
int rowCount = 0;
for (Row row : rs) {
rowCount++;
}
System.out.println("counted rows: " + rowCount);
It's also possible that I'm binding the list of Integers in a wrong way. If that's the case I would appreciate any hints too.
I am using Cassandra 2.0.7 with CQL 3.1.1.
This is not a real-limitation but a PreparedStatement one. Using a BuiltStatement and QueryBuilder I didn't have any of these problems.
Try it yourself:
List<Integer> l = new ArrayList<>();
for (int i = 0; i < 100000; i++) {
l.add(i);
}
BuiltStatement bs = QueryBuilder.select().column("id").from("test.numbers").where(in("id", l.toArray()));
ResultSet rs = Cassandra.DB.getSession().execute(bs);
System.out.println("counted rows: " + rs.all().size());
HTH, Carlo