I'm experimenting with RoaringBitMaps in Java using a custom Elasticsearch plugin and doing rBitmap.contains(1111)
works perfectly.
Now, if I do rBitmap.contains([1111,2222,3333])
, will not work as expected.
What I would expect is an OR
, but doesnt work that way.
What's the most efficient way to compare a RoaringBitMap to an array?, is there a method I'm missing?
I will experiment with a forloop that stops when it finds a contains coincidence, and converting the array into a RoaringMap and then running a cardinality function.
I'm on Java (this is a custom Elasticsearch plugin), trying to find something similar to pyroaring's isdisjoint
This is the part of the plugin that implements RoaringBitmap
@Override
public LeafFactory newFactory(
Map<String, Object> params,
SearchLookup lookup
) {
final byte[] decodedTerms = Base64.getDecoder().decode(params.get("terms").toString());
final ByteBuffer buffer = ByteBuffer.wrap(decodedTerms);
RoaringBitmap rBitmap = new RoaringBitmap();
try {
rBitmap.deserialize(buffer);
}
catch (IOException e) {
// Do something here
}
return new FastFilterLeafFactory(params, lookup, rBitmap);
}
}
@Override
public boolean execute() {
final int docVal;
try {
docVal = Math.toIntExact(docValues.nextValue());
} catch (IOException e) {
throw ExceptionsHelper.convertToElastic(e);
}
if (exclude && rBitmap.contains(docVal)) {
return false;
}
else return !include || rBitmap.contains(docVal);
}
full code here: https://github.com/lsena/fastfilter-elasticsearch-plugin/blob/master/src/main/java/org/elasticsearch/lsena/fastfilter/FastFilterPlugin.java
Thanks
From your code -
rBitmap.contains([1111,2222,3333])
This will check if the input array is a subset of the rBitMap
and that's not what you are looking for.
If you are looking for a way to check if the two roaring bitmaps intersect or not, you can use intersects
method defined here -
public static boolean intersects(final RoaringBitmap x1, final RoaringBitmap x2)
If you are looking for an elastic-search plugin that usages RoaringBitmap implementation to check if two large lists of integers (one stored as a field in elasticsearch document and the other being passed as a filter param in the query to elasticsearch) intersect or not, you can use this plugin - https://github.com/tata1mg/fastmatch-elasticsearch-plugin