python socketserver occasionally stops sending (and receiving?) messages


I've been experiencing a problem with a socket server I wrote: it seems to stop sending and receiving data on one of the ports it uses, while the other port continues to handle data just fine. Interestingly, after a wait of anywhere from a minute to an hour or so, the server starts sending and receiving messages again without any observable intervention.

I am using the Eventlet networking library with Python 2.7, all running on an Ubuntu AWS instance, with external apps opening persistent connections to the socket server.

From some reading I've been doing, it looks like I may not be implementing my socket server correctly. According to http://docs.python.org/howto/sockets.html:

fundamental truth of sockets: messages must either be fixed length (yuck), or be delimited (shrug), or indicate how long they are (much better), or end by shutting down the connection.

I am not entirely sure whether I am using fixed-length messages here (or am I?)

This is how I am receiving my data:

def socket_handler(sock, socket_type):
    logg(1,"socket_handler:initializing")
    while True:
        recv = sock.recv(1024)
        if not recv:
            logg(1,"didn't receive anything")
            break
        if len(recv) > 5:
            logg(1,"socket handler: %s" % recv )
            plug_id, phone_sid, recv_json = parse_json(recv)
            send = 1 
            if "success" in recv_json and recv_json["success"] == "true" and socket_type == "plug":
                send = 0
            if send == 1:
                send_wrapper(sock, message_relayer(recv, socket_type))
        else:
            logg(2, 'socket_handler:Ignoring received input: ' + str(recv)  )
    logg(1,  'Closing socket handle: [%s]' % str(sock))
    sock.shutdown(socket.SHUT_RDWR)
    sock.close()

"sock" is a socket object returned by the listener.accept() function.

The socket_handler function is called like so:

new_connection, address = listener.accept()
...<code omitted>...
pool.spawn_n(socket_handler, new_connection, socket_type)

Does my implementation look incorrect to anyone? Am I basically implementing a fixed length conversation protocol? What can I do to help investigate the issue or make my code more robust?

Thanks in advance,

T


Solution

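  • Your handler assumes that each sock.recv(1024) call returns exactly one complete message, but TCP gives no such guarantee: recv(1024) returns up to 1024 bytes of whatever has arrived, which may be part of a message or several messages run together. That can cause buffering-related problems when the amount the server asks for (1024) doesn't match what the client actually sends.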

    To fix the problem, the usual approach is to send the length of the message first, followed by the message itself. The receiver then reads the length field (which has a known size) and reads the rest of the message based on the decoded length.

    Note: The length field can be as many bytes long as your protocol needs. Some protocols are 4-byte aligned and use a 32-bit field for this, but if 1 or 2 bytes are enough for you, you can use that. The point is that both client and server must agree on the size of this field.
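
The length-prefix scheme described above could be sketched roughly as follows. This is a minimal illustration rather than a drop-in fix: the helper names send_msg, recv_exact and recv_msg are hypothetical, the 4-byte big-endian header is an assumed choice, and the code is written for Python 3 with plain blocking sockets (payloads are bytes).

```python
import socket
import struct

# Assumption: a 4-byte unsigned big-endian length header ("!I").
# Any fixed size works as long as client and server agree on it.
HEADER = struct.Struct("!I")


def send_msg(sock, payload):
    """Send one message: the length header followed by the payload bytes."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv() may return fewer."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            # The peer closed the connection before everything arrived.
            raise EOFError("connection closed, %d bytes missing" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock):
    """Receive one complete message that was framed by send_msg()."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)
```

With helpers like these, the handler loop would call recv_msg(sock) where it currently calls sock.recv(1024), and the client would have to frame its messages with the same header. Each call then yields one whole message, however TCP happens to split or merge the bytes in transit.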