When developing a system that sends or relays email, a network timeout on the remote SMTP host or a system failure at a critical moment can lead to a situation where the sender cannot determine whether the email has been forwarded.
To implement guaranteed delivery, at-least-once semantics seems the likely option. In that case, is it reasonable to assume that deduplication will occur somewhere downstream if the same Message-Id is used when re-sending emails?
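As a rough illustration of the at-least-once approach in question, the sketch below builds the message once, so that the Message-ID is generated a single time, and retries until a send is confirmed. The names build_message and send_at_least_once, the example.org domain, and the injected send callable (for example, a wrapper around smtplib.SMTP.send_message) are assumptions made for illustration only.

```python
from email.message import EmailMessage
from email.utils import make_msgid


def build_message(sender, recipient, subject, body):
    """Build the message once so every retry carries the same Message-ID."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # Generated exactly once; the domain is a placeholder assumption.
    msg["Message-ID"] = make_msgid(domain="example.org")
    msg.set_content(body)
    return msg


def send_at_least_once(msg, send, max_attempts=3):
    """Call send(msg) until it returns without raising.

    `send` is a hypothetical callable supplied by the caller. A timeout
    after the final dot leaves the outcome unknown, so a retry may
    deliver a duplicate of a message that was in fact accepted.
    Returns the number of the attempt that succeeded.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            send(msg)
            return attempt
        except OSError as exc:  # timeouts, connection errors, smtplib errors
            last_error = exc
    raise RuntimeError("delivery not confirmed") from last_error
```

Reusing the identifier only makes duplicates recognizable; whether anything downstream actually discards them is a separate matter.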
Deduplication using the Message-ID could be performed by specific mail clients, but the RFC for SMTP makes no mention of this. Instead, it states that implementations should strive to reduce the critical time window:
RFC 5321 "Simple Mail Transfer Protocol" under 6.1. Reliable Delivery and Replies by Email reads:
To avoid receiving duplicate messages as the result of timeouts, a receiver-SMTP MUST seek to minimize the time required to respond to the final <CRLF>.<CRLF> end of data indicator. See RFC 1047 for a discussion of this problem.
And RFC 1047 "DUPLICATE MESSAGES AND SMTP" states under AVOIDING SYNCHRONIZATION PROBLEMS:
The best way to avoid the synchronization problem is to minimize the length of the synchronization gap [in the state between the final dot]. In other words, receiving mailers should acknowledge the final dot as soon as possible and do more complex processing of the message later.
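The advice in RFC 1047 can be sketched on the receiving side roughly as follows. This is only an illustration under stated assumptions: on_end_of_data and process_spool are hypothetical names, and the in-memory queue stands in for a durable on-disk spool that a real mailer would need before it could safely acknowledge the message.

```python
import queue

# Stand-in for a durable spool; a real receiver would write to stable storage.
spool = queue.Queue()


def on_end_of_data(raw_message: bytes) -> str:
    """Handle the final <CRLF>.<CRLF>: store the message and reply at once."""
    spool.put(raw_message)  # cheap step: take responsibility for the message
    return "250 OK"         # acknowledge before any expensive work


def process_spool(handle) -> int:
    """Drain the spool later, outside the SMTP transaction.

    `handle` is a caller-supplied function for the slow work, such as
    filtering, list expansion, or relaying. Returns the number of
    messages processed.
    """
    processed = 0
    while True:
        try:
            raw = spool.get_nowait()
        except queue.Empty:
            return processed
        handle(raw)
        processed += 1
```

Keeping the work between the final dot and the reply this small narrows the window in which a sender-side timeout can trigger a duplicate transmission, but it does not remove that window.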