Couchbase 6.5.1 and Bloom filters


I'm trying to assess whether using bloom filters is a good idea for my Couchbase deployment. I'm using CB 6.5.1 in value-only ejection mode. From the official docs it's not clear to me when bloom filters are available. Furthermore, I can find their use mentioned only in the docs for versions 5.0 and 5.1. More specifically, in the version 5.0 docs, the Database Engine Architecture section reads:

Full metadata ejection removes all data including keys, metadata, and key-value pairs from the cache for non-resident items. Full ejection is well suited for cases where the application has cold data that is not accessed frequently or the total data size is too large to fit in memory plus higher latency access to the data is accepted. The performance of full eviction cache management is significantly improved by Bloom filters. Bloom filters are enabled by default and cannot be disabled.

So does this mean that they are only available on full ejection mode?

The other page, which I can find only for versions 5.0 and 5.1, is this one, which just describes the functionality of bloom filters in combination with full ejection and XDCR.

So what is going on in version 6.5.x? Are bloom filters used only in full ejection mode, enabled by default, and impossible to disable? Can they be configured somewhere? Can they be used in combination with value-only ejection mode?


Solution

  • A Couchbase bucket in value-only ejection mode has all of the keys for the bucket in metadata, so the benefits of a bloom filter are minimal for most operations as it’s faster to look in the internal memory structures to check if a key exists or not. That said, bloom filters are used in value eviction to improve detection of deleted keys as these are not resident in memory but their tombstones do reside on disk.

    Bloom filters do still exist in the latest Couchbase Server versions, up to and including Couchbase Server 7.0. For example, on my 6.5.1 cluster, I have a value-only bucket called travel-sample. I can see the bloom filter information by using the cbstats CLI command.

    :~$ /opt/couchbase/bin/cbstats -u Administrator -p password -b travel-sample localhost:11210 all | grep bfilter
    
    ep_bfilter_enabled: true
    ep_bfilter_fp_prob: 0.01
    ep_bfilter_key_count: 10000
    ep_bfilter_residency_threshold: 0.1
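
    When checking several buckets or nodes, it can be handy to pull a single value out of that output. What follows is a minimal sketch, assuming the `name: value` layout shown above; the function name `bfilter_stat` is made up for illustration and is not part of cbstats.

    ```shell
    # Hypothetical helper (not part of cbstats): read "name: value" lines on
    # stdin and print the value of the requested ep_bfilter_* stat.
    bfilter_stat() {
        awk -v stat="ep_bfilter_$1" '
            # awk ignores leading whitespace, so indented output still matches
            $1 == (stat ":") { print $2; exit }
        '
    }

    # Usage against a live node (same cbstats invocation as above):
    #   /opt/couchbase/bin/cbstats -u Administrator -p password -b travel-sample localhost:11210 all | bfilter_stat enabled
    ```

    The helper only filters text, so it can also be run against saved cbstats output when comparing nodes.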
    

    There are two configuration options for bloom filters, both modified with the cbepctl command:

    bfilter_enabled - Enable or disable bloom filters (true/false)

    bfilter_residency_threshold - Resident ratio threshold below which all items will be considered in the bloom filters in full eviction policy (0.0 - 1.0)

    For example:

    :~$ /opt/couchbase/bin/cbepctl localhost:11210 -b travel-sample -u Administrator -p password set flush_param bfilter_enabled false

    setting param: bfilter_enabled false
    set bfilter_enabled to false
    

    You can see it’s now disabled.

    :~$ /opt/couchbase/bin/cbstats -u Administrator -p password -b travel-sample localhost:11210 all | grep bfilter
    ep_bfilter_enabled: false
    ep_bfilter_fp_prob: 0.01
    ep_bfilter_key_count: 10000
    ep_bfilter_residency_threshold: 0.1
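
    To reverse the change, or to adjust the other option, the same `set flush_param` form should apply. This is a sketch only: the `set_bfilter_param` wrapper and the `DRY_RUN` switch are illustrative additions, the host, bucket and credentials are the ones used above, and it is an assumption that `bfilter_residency_threshold` is set in exactly the same way as `bfilter_enabled`.

    ```shell
    CBEPCTL=/opt/couchbase/bin/cbepctl
    DRY_RUN=${DRY_RUN:-1}   # default: only print the command; set DRY_RUN=0 to run it

    # Illustrative wrapper (not part of cbepctl): set one bloom filter
    # flush_param on a bucket, or just print the command in dry-run mode.
    set_bfilter_param() {
        bucket=$1
        param=$2
        value=$3
        # Build the command line as the function's positional parameters.
        set -- "$CBEPCTL" localhost:11210 -b "$bucket" -u Administrator -p password \
            set flush_param "$param" "$value"
        if [ "$DRY_RUN" = 1 ]; then
            echo "$*"
        else
            "$@"
        fi
    }

    # Turn the filters back on and restore the threshold seen in the stats output.
    set_bfilter_param travel-sample bfilter_enabled true
    set_bfilter_param travel-sample bfilter_residency_threshold 0.1
    ```

    Printing the command first makes it easy to review what will be sent to the node; rerunning the cbstats check afterwards confirms whether the setting took effect.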
    

    Thank you, Ian McCloy (Principal Product Manager, Couchbase)