From the HTTP/2 spec:
Each header block is processed as a discrete unit. Header blocks MUST be transmitted as a contiguous sequence of frames, with no interleaved frames of any other type or from any other stream. The last frame in a sequence of HEADERS or CONTINUATION frames has the END_HEADERS flag set. The last frame in a sequence of PUSH_PROMISE or CONTINUATION frames has the END_HEADERS flag set. This allows a header block to be logically equivalent to a single frame.
I see how "transmitted as a contiguous sequence of frames, with no interleaved frames of any other type or from any other stream" enables "a header block to be logically equivalent to a single frame", but why is logical equivalence to a single frame so important as to depart from how the rest of HTTP/2 works?
It is important because processing a header block changes connection-level state, namely the HPACK context.
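To process a sequence like HEADERS[stream=3,end_headers=false], HEADERS[stream=5,end_headers=true], CONTINUATION[stream=3,end_headers=true], you would need to wait for the CONTINUATION frame to arrive before processing the HEADERS frame for stream=5, because the HPACK dynamic table is shared by the whole connection and header blocks must be decoded in the order they were encoded.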
Now, imagine a malicious client that never sends that CONTINUATION frame: it could completely halt frame processing on the server and force the server to buffer every arriving frame while it waits for the CONTINUATION frame. Not good.
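To make the buffering problem concrete, here is a minimal sketch of a receiver for a hypothetical HTTP/2 variant that did allow interleaving. The `Frame` class and `process_with_interleaving` function are made up for illustration and do not come from any real HTTP/2 library; the model only tracks which frames can be processed and which are stuck behind an unfinished header block.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    kind: str                 # "HEADERS", "CONTINUATION", "DATA", ...
    stream: int
    end_headers: bool = False


def process_with_interleaving(frames):
    """Toy receiver for a hypothetical protocol that permits interleaving.

    Returns (processed, buffered): the frames handled so far, in the order
    they were handled, and the frames still waiting for an unfinished
    header block to complete.
    """
    processed, buffered = [], []
    open_stream = None        # stream whose header block is still incomplete
    queue = deque(frames)
    while queue:
        frame = queue.popleft()
        continues_open_block = (
            frame.kind == "CONTINUATION" and frame.stream == open_stream
        )
        if open_stream is not None and not continues_open_block:
            # Cannot touch this frame yet: the shared decoder state is
            # mid-update, so it has to wait in memory.
            buffered.append(frame)
            continue
        processed.append(frame)
        if frame.kind in ("HEADERS", "CONTINUATION"):
            if frame.end_headers:
                # Block complete: replay the waiting frames in arrival order.
                open_stream = None
                queue.extendleft(reversed(buffered))
                buffered.clear()
            else:
                open_stream = frame.stream
    return processed, buffered


# Well-behaved peer: stream 5 is delayed until stream 3's block completes.
ok = [
    Frame("HEADERS", 3, end_headers=False),
    Frame("HEADERS", 5, end_headers=True),
    Frame("CONTINUATION", 3, end_headers=True),
]
done, stuck = process_with_interleaving(ok)

# Malicious peer: the CONTINUATION never arrives, so everything piles up.
evil = [
    Frame("HEADERS", 3, end_headers=False),
    Frame("HEADERS", 5, end_headers=True),
    Frame("DATA", 5),
    Frame("DATA", 7),
]
done_evil, stuck_evil = process_with_interleaving(evil)
```

In the second case only the first HEADERS frame is ever processed, and the buffer grows with every frame the peer sends afterwards.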
So, to keep the specification and the protocol simple, it was mandated that the frames of a header block (HEADERS or PUSH_PROMISE followed by CONTINUATION frames) cannot be interleaved with any other frames. This is special treatment compared with all other HTTP/2 frames, which can be interleaved.
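With the contiguity rule in place, a receiver does not need to buffer anything: it can reject an interleaved frame the moment it sees one (the spec treats this as a connection error of type PROTOCOL_ERROR). The check can be sketched roughly as follows; the `ProtocolError` class, the function name, and the `(kind, stream, end_headers)` tuple representation are assumptions made for this illustration, not the API of any real library.

```python
class ProtocolError(Exception):
    """Stand-in for an HTTP/2 connection error of type PROTOCOL_ERROR."""


def check_header_block_contiguity(frames):
    """Validate a sequence of (kind, stream, end_headers) tuples.

    Raises ProtocolError as soon as a frame breaks the rule that a header
    block must be sent as a contiguous run of frames on one stream.
    """
    open_stream = None  # stream with an unfinished header block, if any
    for kind, stream, end_headers in frames:
        if open_stream is not None:
            # Inside a header block only a CONTINUATION on the same
            # stream is acceptable.
            if kind != "CONTINUATION" or stream != open_stream:
                raise ProtocolError(
                    f"{kind} on stream {stream} interleaved with the "
                    f"header block of stream {open_stream}"
                )
        elif kind == "CONTINUATION":
            raise ProtocolError(
                f"CONTINUATION on stream {stream} without an open header block"
            )
        if kind in ("HEADERS", "PUSH_PROMISE", "CONTINUATION"):
            open_stream = None if end_headers else stream


# Accepted: each header block is contiguous.
check_header_block_contiguity([
    ("HEADERS", 3, False),
    ("CONTINUATION", 3, True),
    ("HEADERS", 5, True),
])

# Rejected immediately at the second frame, with nothing buffered.
try:
    check_header_block_contiguity([
        ("HEADERS", 3, False),
        ("HEADERS", 5, True),
        ("CONTINUATION", 3, True),
    ])
    rejected = False
except ProtocolError:
    rejected = True
```

The only state the receiver keeps is one stream identifier, which is the practical payoff of treating a header block as a single logical frame.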