Search code examples
bitmaproaring-bitset

RoaringBitMap contains() array?


I'm experimenting with RoaringBitMaps in Java using a custom Elasticsearch plugin and doing rBitmap.contains(1111) works perfectly.

Now, if I do rBitmap.contains([1111,2222,3333]) , will not work as expected.

What I would expect is an OR, but doesnt work that way.

What's the most efficient way to compare a RoaringBitMap to an array?, is there a method I'm missing?

I will experiment with a forloop that stops when it finds a contains coincidence, and converting the array into a RoaringMap and then running a cardinality function.

I'm on Java (this is a custom Elasticsearch plugin), trying to find something similar to pyroaring's isdisjoint

This is the part of the plugin that implements RoaringBitmap

            @Override
            public LeafFactory newFactory(
                    Map<String, Object> params,
                    SearchLookup lookup
                    ) {
                final byte[] decodedTerms = Base64.getDecoder().decode(params.get("terms").toString());
                final ByteBuffer buffer = ByteBuffer.wrap(decodedTerms);
                RoaringBitmap rBitmap = new RoaringBitmap();
                try {
                    rBitmap.deserialize(buffer);
                }
                catch (IOException e) {
                    // Do something here
                }
                return new FastFilterLeafFactory(params, lookup, rBitmap);
            }
        }
                    @Override
                    public boolean execute() {
                        final int docVal;
                        try {
                            docVal = Math.toIntExact(docValues.nextValue());
                        } catch (IOException e) {
                            throw ExceptionsHelper.convertToElastic(e);
                        }

                        if (exclude && rBitmap.contains(docVal)) {
                            return false;
                        }
                        else return !include || rBitmap.contains(docVal);
                    }

full code here: https://github.com/lsena/fastfilter-elasticsearch-plugin/blob/master/src/main/java/org/elasticsearch/lsena/fastfilter/FastFilterPlugin.java

Thanks


Solution

  • From your code - rBitmap.contains([1111,2222,3333])

    This will check if the input array is a subset of the rBitMap and that's not what you are looking for.

    If you are looking for a way to check if the two roaring bitmaps intersect or not, you can use intersects method defined here -

    public static boolean intersects(final RoaringBitmap x1, final RoaringBitmap x2)
    

    If you are looking for an elastic-search plugin that usages RoaringBitmap implementation to check if two large lists of integers (one stored as a field in elasticsearch document and the other being passed as a filter param in the query to elasticsearch) intersect or not, you can use this plugin - https://github.com/tata1mg/fastmatch-elasticsearch-plugin