I am new to the Bloom filter concept. Please let me know your thoughts on this. I have 3 types of categories, and each type contains billions of categories.
Do I need 3 Bloom filter objects, or is there a way to manage all the category types in one object?
I am using the Apache Hadoop Bloom filter implementation, i.e. org.apache.hadoop.util.bloom.Filter. Is there any other implementation better than this?
What would be the ideal bit array size to handle a billion records?
Do I need 3 Bloom filter objects: that depends on what you want to do (which you didn't describe), but yes.
Is there any other implementation: sure! Try using Google.
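Ideal bit array size: it depends on what you want to do, in particular on how many false positives you can tolerate. Try reading the Wikipedia article about Bloom filters. It gives formulas that relate the bit array size, the number of inserted elements, the number of hash functions and the false positive probability.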
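To make the sizing question concrete, here is a minimal sketch of the standard formulas: for n expected elements and a target false positive probability p, the optimal bit count is m = -n * ln(p) / (ln 2)^2 and the optimal number of hash functions is k = (m / n) * ln 2. The class and method names (BloomSizing, optimalBits, optimalHashCount) are made up for illustration and are not part of Hadoop, and the 1% false positive rate is only an assumed target since the question does not state one.

```java
// Hypothetical helper, not part of any library: computes Bloom filter
// parameters from the standard formulas.
final class BloomSizing {

    // Optimal number of bits m for n expected elements and a target
    // false positive probability p: m = -n * ln(p) / (ln 2)^2.
    static long optimalBits(long n, double p) {
        double ln2 = Math.log(2);
        return (long) Math.ceil(-n * Math.log(p) / (ln2 * ln2));
    }

    // Optimal number of hash functions k for m bits and n elements:
    // k = (m / n) * ln 2, rounded to the nearest integer and at least 1.
    static int optimalHashCount(long n, long m) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
    }

    public static void main(String[] args) {
        long n = 1_000_000_000L; // one billion records, as in the question
        double p = 0.01;         // assumed target: 1% false positives
        long m = optimalBits(n, p);
        int k = optimalHashCount(n, m);
        System.out.printf("bits=%d, bytes=%d, hashes=%d%n", m, (m + 7) / 8, k);
    }
}
```

For a billion records at 1% this comes to roughly 9.6 billion bits (about 1.2 GB) and 7 hash functions, per filter. As far as I know the Hadoop BloomFilter constructor takes the vector size as an int, so a single filter of that size may not fit; check the Javadoc before relying on it.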