I'm looking for a production-quality Bloom filter implementation in Python that can handle fairly large numbers of items (say 100M to 1B items with a 0.01% false positive rate).
Pybloom is one option, but it seems to be showing its age: it regularly emits DeprecationWarning messages on Python 2.5. Joe Gregorio also has an implementation.
Requirements are fast lookup performance and stability. I'm also open to creating Python interfaces to particularly good C/C++ implementations, or even to using Jython if there's a good Java implementation.
Lacking that, any recommendations on a bit array / bit vector representation that can handle ~16E9 bits?
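For context on the sizes involved, here is a rough sketch, not a production implementation. It assumes a 64-bit Python 3 and uses the standard optimal-sizing formulas for a Bloom filter; the names `bloom_parameters` and `BitArray` are made up for illustration. For 1B items at 0.01% the formula gives roughly 19E9 bits, the same order as the figure above.

```python
import math


def bloom_parameters(n_items, fp_rate):
    """Return (bits, hash_count) for an optimally sized Bloom filter.

    Standard formulas: m = -n * ln(p) / (ln 2)^2 and k = (m / n) * ln 2.
    """
    bits = math.ceil(-n_items * math.log(fp_rate) / (math.log(2) ** 2))
    hashes = max(1, round(bits / n_items * math.log(2)))
    return bits, hashes


class BitArray:
    """Fixed-size bit vector backed by a bytearray (8 bits per byte)."""

    def __init__(self, n_bits):
        self.n_bits = n_bits
        # Round up so the final partial byte is allocated too.
        self._bytes = bytearray((n_bits + 7) // 8)

    def set(self, i):
        # i >> 3 selects the byte, i & 7 selects the bit inside it.
        self._bytes[i >> 3] |= 1 << (i & 7)

    def get(self, i):
        return bool(self._bytes[i >> 3] & (1 << (i & 7)))


bits, hashes = bloom_parameters(10**9, 1e-4)
print(bits, hashes)
```

A plain bytearray of 16E9 bits takes about 2 GB of RAM, so at this scale a memory-mapped file is probably the more practical backing store.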
Eventually I found pybloomfiltermmap. I haven't used it, but it looks like it would fit the bill.