python tcp python-requests

Python Requests splits TCP packet


I am trying to script an HTTP POST request with Python.

When I send it with curl from bash, everything works. With Python, using either the requests or the urllib3 library, I get an error response from the API. The POST request carries information in the headers and as JSON in the request body.

What I noticed when intercepting the packets with Wireshark: the curl request (which works) is a single packet of 374 bytes. The Python request (no difference between requests and urllib3 here) is split into 2 separate packets of 253 and 144 bytes.


Wireshark reassembles these without problems, and both seem to contain the complete information in the headers and POST body. But the API I am trying to connect to answers with a not very helpful "Error when processing request".

Since 253 bytes can't be the size limit of a TCP packet, what is the reason for this behavior? Is there a way to fix it?

EDIT:

bash:

curl 'http://localhost/test.php' -H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36' -H 'Content-Type: application/json' -d '{"key1":"value1","key2":"value2","key3":"value3"}'

python:

import requests, json

headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36",
"Content-Type": "application/json"}

data = {"key1":"value1", "key2":"value2", "key3":"value3"}

r=requests.post("http://localhost/test.php", headers=headers, data=json.dumps(data))

Solution

  • TCP is a data stream, not a series of messages. How the stream is segmented into packets should have no bearing on how the data is interpreted, by either the sender or the recipient. If the recipient actually behaves differently depending on how the packets are segmented, then the recipient is broken.

    While I've seen such broken systems, I've seen more systems that reject a request for other reasons, such as a wrong User-Agent, a missing Accept header, or similar. I would suggest you check this first before concluding that the segmentation of the data stream must be the cause.

    As for why curl and requests behave differently: curl probably constructs the full request (headers and body) first and sends it in one go, while requests first constructs and sends the headers and then sends the body, i.e. it performs two write operations, which might result in two packets.
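
To illustrate the point about stream semantics, here is a minimal sketch using only the standard library. It is not how curl, requests, or the API in question are implemented. The helper names (`build_request`, `read_request`, `send_and_receive`) are made up for this example, the request carries fewer headers than the real ones, and `socket.socketpair()` stands in for a real TCP connection (on Unix it is a local stream socket, which has the same byte-stream behavior). A receiver that reads until the blank line and then reads `Content-Length` bytes gets the same body whether the sender used one write or two.

```python
import json
import socket


def build_request(host, path, body):
    """Return (head, body) bytes for a minimal HTTP/1.1 POST (illustrative only)."""
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/json\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return head, body


def read_request(conn):
    """Read one request from a byte stream, regardless of how it was segmented."""
    buf = b""
    # Keep reading until the blank line that ends the header block has arrived.
    while b"\r\n\r\n" not in buf:
        chunk = conn.recv(4096)
        if not chunk:
            raise ConnectionError("connection closed before headers were complete")
        buf += chunk
    head, _, rest = buf.partition(b"\r\n\r\n")
    # Look up Content-Length to learn how many body bytes to expect.
    length = 0
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    # Keep reading until the whole body has arrived.
    while len(rest) < length:
        chunk = conn.recv(4096)
        if not chunk:
            raise ConnectionError("connection closed before body was complete")
        rest += chunk
    return head, rest[:length]


def send_and_receive(writes):
    """Send each item of `writes` with its own send call; return the parsed body."""
    client, server = socket.socketpair()  # stand-in for a TCP connection
    try:
        for part in writes:
            client.sendall(part)
        return read_request(server)[1]
    finally:
        client.close()
        server.close()


payload = json.dumps(
    {"key1": "value1", "key2": "value2", "key3": "value3"}
).encode("utf-8")
head, body = build_request("localhost", "/test.php", payload)

# One write (what curl probably does) versus two writes (what requests probably does).
one_write = send_and_receive([head + body])
two_writes = send_and_receive([head, body])
```

To test the segmentation hypothesis against the real server, the same `head + body` bytes could be sent with a single `sendall` on a socket obtained from `socket.create_connection((host, port))`. If the API still answers with an error, the headers are the more likely culprit.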