aeron

What type of message delivery guarantees can Aeron provide?


What message delivery guarantees does Aeron give me as a messaging framework (at least once, at most once, exactly once)?


Solution

  • Thinking about Aeron in those terms doesn't really help your understanding, as Aeron is flexible with regard to its reliability options - especially when Archive and Cluster are also considered. The best response would be "at most once", but that hides a lot of detail.

    As it states in the README.md, Aeron is a layer 4 protocol, so the reliability options are similar to what you would get with TCP. Lost messages are redelivered as long as the remote ends are alive and heartbeating. However, this general guarantee can be loosened. For multi-node distribution (multicast and MDC) the flow control setting can affect reliability. For example, using the max strategy means that if one of the receivers reports that it is up to date, the flow control window can move forward, and other, slower receivers may end up missing data that can no longer be recovered via NAK. Even with the min strategy, slower receivers that are unresponsive for long enough can drop out of the flow control group and be left behind. There are also options to disable NAKs from receivers, in which case any loss is gap-filled rather than retransmitted.

    This is where Archive comes in: it allows recovery in these situations by storing messages to enable "late join" behaviour. Therefore, once the archive has acknowledged that a message is stored, you could consider this to be "exactly once" in practical terms.

    Cluster takes this to the next level by using a quorum agreement to prevent message loss in the case of node failure.
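
To make the flow control point above more concrete, here is a minimal sketch in plain Java. It is a toy model, not Aeron's API: the class `FlowControlSketch`, its method names, and the idea of reducing each receiver to a single reported position plus a fixed window are all assumptions made for illustration. It only shows why a max-style strategy can let the sender run ahead of a slow receiver while a min-style strategy cannot.

```java
import java.util.Arrays;

// Toy model only: none of these names come from Aeron itself.
// Each receiver is reduced to the stream position it last reported,
// and every receiver is assumed to advertise the same window length.
final class FlowControlSketch {

    // Max-style strategy: the sender limit follows the FASTEST receiver.
    static long maxStrategyLimit(long[] receiverPositions, long windowLength) {
        return Arrays.stream(receiverPositions).max().orElse(0L) + windowLength;
    }

    // Min-style strategy: the sender limit follows the SLOWEST receiver.
    static long minStrategyLimit(long[] receiverPositions, long windowLength) {
        return Arrays.stream(receiverPositions).min().orElse(0L) + windowLength;
    }

    // How far the sender limit extends past the slowest receiver's window.
    // A value above zero means the slowest receiver may be sent data it
    // has no room for, and so may end up missing it.
    static long bytesBeyondSlowest(long senderLimit, long[] receiverPositions, long windowLength) {
        long slowestLimit = minStrategyLimit(receiverPositions, windowLength);
        return Math.max(0L, senderLimit - slowestLimit);
    }

    public static void main(String[] args) {
        long[] positions = {4096L, 1024L}; // one fast receiver, one slow receiver
        long window = 1024L;

        long maxLimit = maxStrategyLimit(positions, window);
        long minLimit = minStrategyLimit(positions, window);

        System.out.println("max strategy limit: " + maxLimit);
        System.out.println("min strategy limit: " + minLimit);
        System.out.println("beyond slowest (max): " + bytesBeyondSlowest(maxLimit, positions, window));
        System.out.println("beyond slowest (min): " + bytesBeyondSlowest(minLimit, positions, window));
    }
}
```

With receivers at positions 4096 and 1024 and a 1024-byte window, the max-style limit is 5120 and the min-style limit is 2048, so under the max-style strategy the sender may run 3072 bytes past what the slow receiver can accept. The model leaves out the case the answer also mentions, where an unresponsive receiver is dropped from the min group after a timeout.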