java php sockets tcp bitstuffing

How do I account for messages being broken up when using sockets?


My Design

I'm using sockets to implement a chat server.

The client side uses Java's java.net.Socket and BufferedReader to read messages from the server.

The server side uses PHP's socket_read() to get messages from the clients.

It uses PHP's socket_write() to send messages to the clients. socket_write() does not guarantee that the entire message will be written in one call, which means I may have to call it several times to send the whole message.

(In terms of design, clients send messages to the server, and server reroutes those messages to the appropriate clients.)

Concerns

My concern is that a message may be broken up into several smaller pieces. So when the server or a client reads an incoming message, it may actually get only a fragment of the original.

Questions

Is this something I need to account for? If yes, how?

Possible Solution

Right now I'm thinking about using byte stuffing (a networking technique that adds flag bytes to mark the start and end of each message before sending it out, and escapes any occurrence of those flag bytes inside the message itself).


Solution

  • Yes, this is something you need to handle in your protocol.

    The two most typical approaches here are:

    1. Make your protocol line-oriented. Terminate every message with a newline, and don't treat a line as complete until you see that newline character. This, of course, depends on newlines not naturally appearing in messages.

      Some protocols which use this approach include SMTP, IMAP, and IRC.

    2. Include the length of the message in its header, so that you know how much data to read.

      Some protocols which use this approach include HTTP (in the Content-Length header) and TLS, as well as many low-level protocols such as IP.
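Since the client in the question already reads through a BufferedReader, the line-oriented approach could be sketched in Java roughly as follows. The class and method names (LineFraming, sendLine, recvLine) are made up for illustration, and the in-memory streams in main stand in for a real socket's streams.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical helper: newline-delimited framing over any character stream.
final class LineFraming {

    // Writes one message followed by '\n'. Embedded line breaks are rejected,
    // because the receiver would mistake them for message boundaries.
    static void sendLine(Writer out, String message) throws IOException {
        if (message.indexOf('\n') >= 0 || message.indexOf('\r') >= 0) {
            throw new IllegalArgumentException("message must not contain line breaks");
        }
        out.write(message);
        out.write('\n');
        out.flush();
    }

    // Returns the next message, or null at end of stream. readLine() keeps
    // reading from the underlying stream until it has seen a line terminator,
    // no matter how the bytes were split up in transit.
    static String recvLine(BufferedReader in) throws IOException {
        return in.readLine();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a StringWriter/StringReader pair plays the role of the socket.
        StringWriter wire = new StringWriter();
        sendLine(wire, "hello");
        sendLine(wire, "second message");

        BufferedReader in = new BufferedReader(new StringReader(wire.toString()));
        String message;
        while ((message = recvLine(in)) != null) {
            System.out.println("got: " + message);
        }
    }
}
```

Two caveats with this sketch: readLine() also treats "\r" and "\r\n" as terminators, and if the connection closes in the middle of a message it returns the partial text as if it were a complete line.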

    If you aren't sure which approach to take, the second one is considerably easier to implement, and doesn't place any restrictions on what data you use it with. A simple implementation might store the count of bytes as a fixed-size packed integer ahead of the data, and could look like the following pseudocode:

    send_data(dat):
        send(length of dat as packed integer)
        send(dat)

    recv_data():
        size = unpack(recv(size of packed integer))
        return recv(size)
    

    (This code assumes that the abstract send() and recv() methods will block until the entire requested amount of data has been sent or received. Real socket calls may transfer fewer bytes than you asked for, so your code will have to loop until everything has been handled.)
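On the Java side, the length-prefixed approach might look something like the sketch below. It assumes a 4-byte big-endian length header, which is what DataOutputStream.writeInt() produces. The class name, method names, and the 1 MiB size cap are illustrative choices, not anything fixed by the question. DataInputStream.readFully() does the "keep reading until the whole message has arrived" loop for you.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: each message is a 4-byte big-endian length plus payload.
final class LengthPrefixedFraming {

    // Assumed upper bound, so a corrupt or hostile header cannot force a huge allocation.
    static final int MAX_MESSAGE_BYTES = 1 << 20;

    static void sendMessage(DataOutputStream out, byte[] payload) throws IOException {
        out.writeInt(payload.length); // header: payload size in bytes
        out.write(payload);           // body: OutputStream.write() writes the whole array
        out.flush();
    }

    static byte[] recvMessage(DataInputStream in) throws IOException {
        int size = in.readInt();      // blocks until all 4 header bytes have arrived
        if (size < 0 || size > MAX_MESSAGE_BYTES) {
            throw new IOException("invalid message length: " + size);
        }
        byte[] payload = new byte[size];
        in.readFully(payload);        // loops internally until 'size' bytes are read
        return payload;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: byte-array streams stand in for socket.getOutputStream()
        // and socket.getInputStream().
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(wire);
        sendMessage(out, "hello\nworld".getBytes(StandardCharsets.UTF_8));
        sendMessage(out, "second".getBytes(StandardCharsets.UTF_8));

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(new String(recvMessage(in), StandardCharsets.UTF_8));
        System.out.println(new String(recvMessage(in), StandardCharsets.UTF_8));
    }
}
```

Both readInt() and readFully() throw EOFException if the connection closes partway through a message, so a truncated message cannot be mistaken for a complete one. For the PHP server to interoperate with this layout, it would need to write the same header, for example with pack('N', strlen($data)), and loop on socket_write() and socket_read() until the full count has been transferred.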