apache-pulsar

Topic replication in Apache Pulsar


The documentation on replication in Pulsar is not very descriptive. I am wondering how replication works in detail and how the persistence policies for a namespace play into it. The documentation mentions these parameters:

  • bookkeeper-ack-quorum: The number of acks (guaranteed copies) to wait for each entry
  • bookkeeper-ensemble: The number of bookies to use for a topic
  • bookkeeper-write-quorum: How many writes to make of each entry

Does bookkeeper-ack-quorum mean that the ack to the client is delayed until this number of bookies have written the entry to disk?

What is the difference between bookkeeper-ensemble and bookkeeper-write-quorum?

Let's assume I have 3 bookies and I want topics in the namespace to reside on each of them. Do I then set both values to 3?


Solution

  • Does bookkeeper-ack-quorum mean that the ack to the client is delayed until this number of bookies have written the entry to disk?

    That's correct. If your ack-quorum is 2, you will have 2 guaranteed copies of the message when the publish succeeds. In the default configuration, this means the message has been written and flushed (fsynced) to disk on 2 machines.
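One way to picture the effect of the ack quorum is as a rule on when the publish is confirmed: the entry is sent to the write-quorum bookies, and the publish completes once ack-quorum of them have responded. The following is only a toy model of that rule, not Pulsar or BookKeeper code; the function name `publish_ack_latency` and the latency numbers are made up for illustration.

```python
def publish_ack_latency(bookie_latencies_ms, ack_quorum):
    """Toy model: time until `ack_quorum` of the written copies are acked.

    `bookie_latencies_ms` holds one (hypothetical) write latency per bookie
    in the write quorum. The publish is confirmed when the ack_quorum-th
    fastest bookie has responded.
    """
    if not 1 <= ack_quorum <= len(bookie_latencies_ms):
        raise ValueError("ack quorum must be between 1 and the write quorum")
    return sorted(bookie_latencies_ms)[ack_quorum - 1]


if __name__ == "__main__":
    latencies = [4.0, 25.0, 9.0]  # made-up latencies for w=3 bookies
    print(publish_ack_latency(latencies, ack_quorum=2))  # 9.0
    print(publish_ack_latency(latencies, ack_quorum=3))  # 25.0
```

In this model, choosing an ack quorum smaller than the write quorum means a single slow bookie does not delay the publish, at the cost of fewer guaranteed copies at the moment of confirmation.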

  • What is the difference between bookkeeper-ensemble and bookkeeper-write-quorum?

    Ensemble is the number of bookies to be used for a ledger. Most of the time it is configured to be equal to the write quorum.

    Setting ensemble > write-quorum enables "striping" of entries across multiple bookies within a single topic.

    For example, setting e=5, w=2, a=2 has these effects:

    • Each message is written in 2 copies, and we wait for 2 acks
    • Messages are striped round-robin across 5 bookies
    • Each bookie stores only a subset of the messages (2/5 of them)
    • Each bookie handles a smaller share of the write/read traffic

    Basically, it allows scaling up the I/O for a single ledger without relaxing ordering.
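The e=5, w=2 example can be checked with a small simulation. This is a sketch that assumes a plain round-robin placement, where entry i is written to the w consecutive bookies starting at index i mod e; the function names are hypothetical and this is not BookKeeper's actual code.

```python
from collections import Counter


def write_set(entry_id, ensemble_size, write_quorum):
    """Indices of the bookies that receive a copy of the given entry."""
    return [(entry_id + j) % ensemble_size for j in range(write_quorum)]


def bookie_load(num_entries, ensemble_size, write_quorum):
    """Count how many entries each bookie in the ensemble ends up storing."""
    load = Counter()
    for entry_id in range(num_entries):
        load.update(write_set(entry_id, ensemble_size, write_quorum))
    return load


if __name__ == "__main__":
    total = 1000
    load = bookie_load(num_entries=total, ensemble_size=5, write_quorum=2)
    for bookie in sorted(load):
        share = load[bookie] / total
        print(f"bookie {bookie}: {load[bookie]} entries ({share:.0%})")
```

With 1000 entries, each of the 5 bookies stores 400 of them, which is the 2/5 share mentioned above, while every entry still exists in 2 copies.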

  • Let's assume I have 3 bookies and I want topics in the namespace to reside on each of them. Do I then set both values to 3?

    Correct. However, ensemble also represents the minimum number of bookies that need to be available in order for writes to be accepted.

    If you have 3 bookies and set ensemble=3, you won't be able to tolerate a node failure.
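The arithmetic behind this can be written down in a few lines. This is a deliberately simplified sketch that only encodes the rule stated above (writes need at least `ensemble` bookies available); the function name `tolerated_failures` is hypothetical and the model ignores everything else about the cluster.

```python
def tolerated_failures(total_bookies, ensemble_size):
    """How many bookies can be down while writes are still accepted.

    Simplified rule: writes need at least `ensemble_size` available bookies.
    """
    if ensemble_size > total_bookies:
        raise ValueError("ensemble cannot exceed the number of bookies")
    return total_bookies - ensemble_size


if __name__ == "__main__":
    print(tolerated_failures(total_bookies=3, ensemble_size=3))  # 0
    print(tolerated_failures(total_bookies=3, ensemble_size=2))  # 1
    print(tolerated_failures(total_bookies=5, ensemble_size=3))  # 2
```

Under this rule, keeping writes available through one bookie failure on a 3-bookie cluster means either lowering the ensemble to 2 or adding a fourth bookie.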