Issuing multiple requests before getting response

I'm having trouble understanding how HTTP works when multiple requests are sent in parallel (before getting a response). There are two cases:

1) With Connection: Keep-Alive.

According to HTTP spec:

A client that supports persistent connections MAY "pipeline" its requests (i.e., send multiple requests without waiting for each response). A server MUST send its responses to those requests in the same order that the requests were received.

That seems quite difficult to implement and maintain. The server has to keep track of the order of requests and respond in the correct order. Not only might that be hard to implement, there is also a performance hit: fast requests that were issued later have to wait until earlier slow requests have been processed.
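To put a number on that performance hit, here is a minimal sketch. It assumes (my assumption, not something the spec says) that the server works on all pipelined requests concurrently starting at time 0, so the only delay comes from the ordering rule. The finish times are made up for illustration.

```python
from itertools import accumulate

def in_order_send_times(finish_times):
    """Earliest time each response can be sent when responses must keep request order.

    A response can only go out once it and every earlier response are ready,
    which is the running maximum of the finish times.
    """
    return list(accumulate(finish_times, max))

# Hypothetical workload: one slow request followed by two fast ones (seconds).
finish_times = [5.0, 1.0, 1.0]

send_times = in_order_send_times(finish_times)
extra_wait = [sent - done for sent, done in zip(send_times, finish_times)]

print(send_times)  # when each response can actually leave the server
print(extra_wait)  # time each response sat finished but blocked
```

The two fast responses are ready after one second but cannot be sent until the slow one ahead of them is done.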

Also, if a load balancer is involved, the proxy has to keep track of which request was sent to which server so that, when the responses come back, it can queue them and respond in order. So why not do it the other way in the first place? It sounds more natural and easier for the client to add (for example) an ID header, and for the server to process the request and respond with the same ID header, so that the client can match each response to its request. That is a lot easier to implement and it does not introduce problems with queueing requests (it is up to the client to track the order of requests if necessary).
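A rough sketch of the ID-header idea, to show what I mean. This is not how HTTP/1.1 pipelining works; the header name `X-Request-ID`, the dictionary shapes, and `fake_server` are all hypothetical and only stand in for a real client and server.

```python
import random

ID_HEADER = "X-Request-ID"  # hypothetical header name, not part of HTTP/1.1

def fake_server(requests):
    """Stand-in server: answers in arbitrary order, echoing each request's ID."""
    responses = [
        {"headers": {ID_HEADER: req["headers"][ID_HEADER]},
         "body": "result for " + req["path"]}
        for req in requests
    ]
    random.shuffle(responses)  # simulate fast requests finishing before slow ones
    return responses

def match_responses(requests, responses):
    """Client side: pair each request with its response using the echoed ID."""
    by_id = {resp["headers"][ID_HEADER]: resp for resp in responses}
    return [(req["path"], by_id[req["headers"][ID_HEADER]]["body"])
            for req in requests]

requests = [
    {"path": path, "headers": {ID_HEADER: str(i)}}
    for i, path in enumerate(["/slow", "/fast", "/other"])
]
matched = match_responses(requests, fake_server(requests))
print(matched)
```

Whatever order the responses arrive in, the client can still line them up with the requests it sent.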

So the question is: what's the reason to specify pipelining in the way it was specified?

2) Without Connection: Keep-Alive.

I couldn't find any info about that case. Let's say that a client issues two requests A and B. Without keep-alive the server will close the connection after processing the request. This obviously introduces a race condition. So how should it behave? Should it discard the second request?


  • 1) With Keep-Alive:

    According to this Wikipedia article, it is the opposite: implementation on the server side is actually very simple. I believe this claim rests on the assumption that a single thread handles all requests for a single connection (which was certainly the general case when this mechanism was designed), so several requests on the same connection are handled sequentially by that thread (and since TCP guarantees ordered delivery, responses are naturally received in the order they were processed). It might be different today with a non-blocking server implementation.

    2) Without Keep-Alive:

    Without keep-alive, you do not pipeline requests, so I do not see the race condition. You have two separate connections for requests A and B, each connection is closed when the request has been completed.

    If the client tries to pipeline requests without keep-alive, I believe the following section of the specification applies:

    Clients which assume persistent connections and pipeline immediately after connection establishment SHOULD be prepared to retry their connection if the first pipelined attempt fails. If a client does such a retry, it MUST NOT pipeline before it knows the connection is persistent. Clients MUST also be prepared to resend their requests if the server closes the connection before sending all of the corresponding responses.

    My interpretation is that the server can legitimately discard the second request and respond only to the first one, since responses are FIFO. It's up to the client to resend the second request.

    Keep in mind: these are mostly suppositions on my part; I hope they make sense to you!
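To illustrate both cases, here is a minimal sketch of the single-threaded model I have in mind. It is not a real server: it works on an in-memory buffer, assumes requests have no body, and every function name is made up for the example. With keep-alive the requests are simply handled one after another, so the responses come out in FIFO order for free. Without it, only the first request is answered and the client works out what to resend.

```python
TERMINATOR = b"\r\n\r\n"

def split_pipelined(buffer):
    """Split a buffer of body-less requests on the blank line ending each header block."""
    return [part + TERMINATOR for part in buffer.split(TERMINATOR) if part]

def handle(request):
    """Build a trivial response that names the requested path."""
    path = request.split(b" ")[1]
    body = b"you asked for " + path
    return b"HTTP/1.1 200 OK\r\nContent-Length: %d\r\n\r\n%s" % (len(body), body)

def serve_connection(buffer, keep_alive):
    """Handle requests sequentially; without keep-alive, stop after the first one."""
    requests = split_pipelined(buffer)
    if not keep_alive:
        requests = requests[:1]  # connection closes, later requests are discarded
    return [handle(request) for request in requests]

def requests_to_resend(sent, responses):
    """Responses are FIFO, so the unanswered requests are the tail of what was sent."""
    return sent[len(responses):]

sent = [
    b"GET /a HTTP/1.1\r\nHost: example.test\r\n\r\n",
    b"GET /b HTTP/1.1\r\nHost: example.test\r\n\r\n",
]
pipelined = b"".join(sent)

persistent = serve_connection(pipelined, keep_alive=True)
closed_early = serve_connection(pipelined, keep_alive=False)
retry = requests_to_resend(sent, closed_early)
```

Because the ordering is guaranteed, the client needs no IDs to know which requests went unanswered: counting the responses is enough.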