I'm working on some networking code for an iPhone application that interfaces with a Python Twisted backend. I've recently been running into a problem where it appears that either my NSOutputStream is doubling up the payload on send OR Twisted is doubling up the payload on receive.
I'm using the "Apple-recommended" style of TCP sockets, i.e., non-polling.
The process is as follows:
CLIENT
- NSStreamEventHasSpaceAvailable: send a packet of X bytes of data
- NSStreamEventHasSpaceAvailable: send another packet of Y bytes of data
SERVER
- Twisted receives packet of size (X + Y) bytes
I'm making sure I explicitly don't send data while the outputStream's status is NSStreamStatusWriting, and I'm also ensuring that the client never sends data unless NSStreamEventHasSpaceAvailable has fired.
Any ideas as to what may be causing this double-up/merger of the payload? The Twisted code is fairly straightforward, using the standard dataReceived in my Protocol:
    def dataReceived(self, data):
        # do logic in order to decide how to handle data
        # ...
        # a print of len(data) here reveals the merged packet size
iOS code is fairly standard as well:
    if (eventCode == NSStreamEventHasSpaceAvailable) {
        [outputStream write:[packet getData] maxLength:[packet getPacketSize]];
    }

    // [packet getData] simply returns a standard UInt8 array.
    // [packet getPacketSize] returns the size of that array.
When the above iOS code is called twice in a row (e.g., sending two packets one after another), the Twisted code reports the merged data size.
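For what it's worth, the same merging is reproducible with plain sockets, with no iOS or Twisted code involved at all. A minimal, self-contained sketch (the port number and payload sizes are arbitrary, not from my actual app):

    import socket
    import threading
    import time

    def server():
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        srv.bind(("127.0.0.1", 9999))
        srv.listen(1)
        conn, _ = srv.accept()
        time.sleep(0.5)               # give both client writes time to arrive
        data = conn.recv(4096)
        # usually prints X + Y (30 here), not X and then Y separately
        print("received %d bytes in one read" % len(data))
        conn.close()
        srv.close()

    t = threading.Thread(target=server)
    t.start()
    time.sleep(0.1)                   # crude wait for the server to start listening

    cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    cli.connect(("127.0.0.1", 9999))
    cli.sendall(b"A" * 10)            # first "packet", X = 10 bytes
    cli.sendall(b"B" * 20)            # second "packet", Y = 20 bytes
    cli.close()
    t.join()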
Thanks in advance for any advice or suggestions.
I agree -- I shouldn't necessarily expect the buffer boundaries to match up, but I suppose it's a matter of predictable behavior.
There is no predictable behavior in TCP-based communications; there may be any number of routers, NAT boundaries, goofy proxies or whatever in between you and the remote host, each of which can dictate behavior that appears unexpected.
Heck, there might even be carrier pigeons carrying your packets byte by byte.
In the real world, though, the behavior is generally fairly predictable. But not always, never 100% of the time, and always with the potential of some customer somewhere with clogged tubes.
With TCP you are, at least, guaranteed that the bytes will be received in the order they were sent unless there is an error. Assuming, again, that all points in between are both implemented correctly and non-malicious (that latter bit means you have to assume that sometimes data will be corrupted).
Even that guarantee doesn't mean much; you may receive the first 8 of 10 packets... or you might receive all of the in-bound data only to find that the out-bound connection is dead when you go to respond.
Bottom line: your buffering algorithm on both sides must assume that the buffer may be filled in random bursts whose sizes bear no relation to the sizes of the writes that produced them on the other side. While not strictly required, your app will also be best served by defending against random connection failures, truncated buffers, and data corruption.
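As an illustration, Twisted already ships a length-prefixed framing protocol that does this buffering for you; a minimal server sketch (the class name and port are made up for the example, not taken from the question):

    from twisted.internet import reactor
    from twisted.internet.protocol import Factory
    from twisted.protocols.basic import Int32StringReceiver

    class PacketProtocol(Int32StringReceiver):
        def stringReceived(self, msg):
            # Called exactly once per complete length-prefixed message,
            # no matter how TCP split or merged the underlying segments.
            print("got one application-level packet of %d bytes" % len(msg))

    factory = Factory()
    factory.protocol = PacketProtocol
    reactor.listenTCP(8000, factory)
    reactor.run()

The client then prefixes each payload with a 4-byte big-endian length before handing it to the output stream, and the boundary problem disappears.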
Early byte-length fields & checksums are your friend. Assumptions are you !hjfdahjdas8y!$(&($@#&^@#)^&!@#&_[CONNECTION LOST]
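(...which is exactly why you want the length field and the checksum: so that garbage like the line above gets detected and rejected instead of parsed.) For illustration only, one possible frame layout in Python, assuming a 4-byte big-endian length followed by a 4-byte CRC32 and then the payload:

    import struct
    import zlib

    def frame(payload):
        # [4-byte length][4-byte CRC32][payload]
        return struct.pack("!II", len(payload), zlib.crc32(payload)) + payload

    def unframe(buf):
        # Returns (payload, remaining_bytes), or (None, buf) if the
        # buffer does not yet hold a complete frame.
        if len(buf) < 8:
            return None, buf
        length, crc = struct.unpack("!II", buf[:8])
        if len(buf) < 8 + length:
            return None, buf
        payload = buf[8:8 + length]
        if zlib.crc32(payload) != crc:
            raise ValueError("checksum mismatch: corrupted frame")
        return payload, buf[8 + length:]

The receive side appends everything it gets to a running buffer and calls unframe in a loop, so it never matters how the bytes were chopped up in transit.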