I had a discussion with a developer earlier today re identifying TCP packets going out on a particular interface with the same payload. He told me that the probability of finding a TCP packet that has an equal payload (even if the same data is sent out several times) is very low due to the way TCP packets are constructed at system level. I was aware this may be the case due to the system's MTU settings (usually 1500 bytes) etc., but what sort of probability stats am I really looking at? Are there any specific protocols that would make it easier identifying matching payloads?
EDIT: Sorry, my original idea was ridiculous.
You got me interested so I googled a little bit and found this. If you wanted to write your own tool you would probably have to inspect each payload, the easiest way would probably be some sort of hash/checksum to check for identical payloads. Just make sure you are checking the payload, not the whole packet.
As for the statistics I will have to defer to someone with greater knowledge on the workings of TCP.