Search code examples
javarabbitmqamqp

RabbitMQ Batch Ack


I had a question on how rabbitmq works with batching acknowledgements. I understand that the Prefetch value is the max number of messages that will get queued before reaching its limit. However, I wasn't sure if the ack's manage themselves or if I have to manage this in code.

Which method is correct?

Send each basicAck with multiple set to true

or

wait until 10 acks were supposed to be sent out and send only the last one and AMQP will automatically send all previous in queue. (with multiple set to true)


Solution

  • TL;DR multiple = true is faster in some cases but requires a lot more careful book keeping and batch like requirements

    The consumer gets messages that have a monotonic-ly growing id specific to that consumer. The id is a 64 bit number (it actually might be an unsigned 32 bit but since Java doesn't have that its a long) called the delivery tag. The prefetch is the most messages a consumer will receive that are unacked.

    When you ack the highest delivery tag with multiple true it will acknowledge all the unacked messages with a lower delivery tag (smaller number) that the consumer has outstanding. Obviously if you have high prefetch this is faster than acking each message.

    Now RabbitMQ knows the consumer received the messages (the unacked ones) but it doesn't know if all those messages have been correctly consumed. So it is on the burden of you the developer to make sure all the previous messages have been consumed. The consumer will deliver the messages in order (I believe internally the client uses a BlockingQueue) but depending on the library/client used downstream the messages might not be.

    Thus this really only works well when you are batching the messages together in a single go (e.g. transaction or sending a group of messages off to some other system) or buffering reliably. Often this is done with a blocking queue and then periodically draining the queue to send a group of messages to a downstream system.

    On the other hand if you are streaming each message in real time then you can't really do this (ie multiple = false).

    There is also the case of one of the message being bad in the group (e.g. drained from internal queue... not rabbit queue) and you won't to nack that bad one. If that is the case you can't use multiple = true either.

    Finally if you wait for a certain amount messages (instead of say time) more than the prefetch you will wait indefinitely.... not a good idea. You need to wait on time and number of messages must be <= prefetch.

    As you can see its fairly nontrivial to correctly use multiple = true.