I am trying to write a Python module to communicate with a fixed HTTP server on a hardware device in order to send data to it. I am able to send data correctly via curl, but for some reason it does not work correctly when I use the requests module in Python.
I have confirmed (by using httpbin.org/post) that the two requests are identical, but for some reason only the one sent via curl actually works.
When I look at the tcpdumps of the two requests, I do see a difference: The initial handshake is essentially identical, and then the data is sent (in both cases) as three separate packets.
From curl, the communication post-handshake looks like:
17:58:31.691251 IP CLIENT.56184 > SERVER.http: Flags [P.], seq 1:232, ack 1, win 29200, length 231: HTTP: POST /index.html HTTP/1.1
E.....@.@.....n:..n..x.P.......(P.r.5h..POST /index.html HTTP/1.1
User-Agent: curl/7.29.0
Host: SERVER
Accept: */*
Content-Length: 1258
Expect: 100-continue
Content-Type: multipart/form-data; boundary=----------------------------61700007fd77
.........7.?`)+.
17:58:31.766389 IP SERVER.http > CLIENT.56184: Flags [.], ack 232, win 1817, length 0
E..(;.....Ks..n...n:.P.x...(....P.... ........................
17:58:32.692418 IP CLIENT.56184 > SERVER.http: Flags [P.], seq 232:486, ack 1, win 29200, length 254: HTTP
E..&..@.@.....n:..n..x.P.......(P.r.5...------------------------------61700007fd77
< Data for packet 2 >
..........8.?`..
17:58:32.856104 IP SERVER.http > CLIENT.56184: Flags [.], ack 486, win 1563, length 0
E..(;.....Km..n...n:.P.x...(....P.... ..............x...8.?`R.
17:58:32.856139 IP CLIENT.56184 > SERVER.http: Flags [P.], seq 486:1490, ack 1, win 29200, length 1004: HTTP
E.....@.@.....n:..n..x.P.......(P.r.8m..[ID]
< Data for packet 3 >
....8.?`...6....
17:58:32.919921 IP SERVER.http > CLIENT.56184: Flags [.], ack 1490, win 2048, length 0
E..(;.....Kl..n...n:.P.x...(....P....O..................8.?`O.
17:58:32.924255 IP SERVER.http > CLIENT.56184: Flags [P.], seq 1:121, ack 1490, win 2048, length 120: HTTP: HTTP/1.0 200 OK
E...;.....J...n...n:.P.x...(....P....o..HTTP/1.0 200 OK
Content-Type: text/javascript
Access-Control-Allow-Origin: *
Content-length: 0
Connection: close
........8.?`._.7
It's very clean: as I read this, we send the first packet, it is acknowledged, we send the second, etc., and eventually the connection is closed after we receive a nice happy response.
However, the communication from requests does not go as smoothly. Sample code to reproduce this is:
import requests
headers = {"User-Agent": "test client"}
files = {"binary": ("filename", "file contents", "application/octet-stream")}
data = {"type": "upload"}
# requests needs an explicit scheme in the URL, otherwise it raises MissingSchema
requests.post("http://remote.host.url/index.html", data=data, files=files, headers=headers)
which produces a much dirtier output:
18:24:46.311756 IP CLIENT.56212 > SERVER.http: Flags [P.], seq 1:289, ack 1, win 29200, length 288: HTTP: POST /index.html HTTP/1.1
E..H..@.@.....n:..n....P.9.N..v.P.r.5...POST /index.html HTTP/1.1
Host: SERVER
User-Agent: test client
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
Content-Length: 1247
Content-Type: multipart/form-data; boundary=d8a887dda41b5a35f61ccf79b26d7b4e
........^.?`.C..
18:24:46.311772 IP CLIENT.56212 > SERVER.http: Flags [.], seq 289:1313, ack 1, win 29200, length 1024: HTTP
E..(..@.@.....n:..n....P.9.n..v.P.r.8...--d8a887dda41b5a35f61ccf79b26d7b4e
< Data from packet 2 >
........^.?`+Z..
18:24:46.311777 IP CLIENT.56212 > SERVER.http: Flags [P.], seq 1313:1536, ack 1, win 29200, length 223: HTTP
E.....@.@.....n:..n....P.9.n..v.P.r.5`..
< Data from packet 3 >
................
18:24:46.525743 IP SERVER.http > CLIENT.56212: Flags [.], ack 289, win 1760, length 0
E..([D....,%..n...n:.P....v..9.nP....0..................^.?`..
18:24:46.800583 IP CLIENT.56212 > SERVER.http: Flags [.], seq 289:1313, ack 1, win 29200, length 1024: HTTP
E..(..@.@.....n:..n....P.9.n..v.P.r.8...--d8a887dda41b5a35f61ccf79b26d7b4e
< Data from packet 2, again >
........^.?`.../
18:24:46.803014 IP SERVER.http > CLIENT.56212: Flags [.], ack 1313, win 2048, length 0
E..([E....,$..n...n:.P....v..9.nP...................p...^.?`.R
18:24:46.803033 IP CLIENT.56212 > SERVER.http: Flags [P.], seq 1313:1536, ack 1, win 29200, length 223: HTTP
E.....@.@.....n:..n....P.9.n..v.P.r.5`..
< Data from packet 3, again >
.........^.?`k?.
18:24:46.813645 IP SERVER.http > CLIENT.56212: Flags [F.], seq 1, ack 1536, win 1825, length 0
E..([F....,#..n...n:.P....v..9.MP..!....................^.?`h.
18:24:46.813813 IP CLIENT.56212 > SERVER.http: Flags [F.], seq 1536, ack 2, win 29200, length 0
E..(..@.@.....n:..n....P.9.M..v.P.r.4...........^.?`...0
18:24:46.814339 IP SERVER.http > CLIENT.56212: Flags [.], ack 1537, win 1824, length 0
E..([G....,"..n...n:.P....v..9.NP.. ....................^.?`..
18:24:46.816550 IP CLIENT.56214 > SERVER.http: Flags [S], seq 1228421461, win 29200, options [mss 1460,sackOK,TS val 3666736130 ecr 0,nop,wscale 7], length 0
E..<.W@.@.8...n:..n....PI89U......r.4..........
................^.?`0..0....
18:24:46.817006 IP SERVER.http > CLIENT.56214: Flags [S.], seq 416609351, ack 1228421462, win 2048, options [mss 1460], length 0
E..,[H....,...n...n:.P.....GI89V`.......................^.?`..
18:24:46.817021 IP CLIENT.56214 > SERVER.http: Flags [.], ack 1, win 29200, length 0
E..(.X@.@.9...n:..n....PI89V...HP.r.4...........^.?`.0.0
18:24:46.817049 IP CLIENT.56214 > SERVER.http: Flags [P.], seq 1:289, ack 1, win 29200, length 288: HTTP: POST /index.html HTTP/1.1
E..H.Y@.@.7...n:..n....PI89V...HP.r.5...POST /index.html HTTP/1.1
Host: SERVER
User-Agent: test (EPICS base 7.0.4-E3-7.0.4-patch IOC)
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
Content-Length: 1247
Content-Type: multipart/form-data; boundary=04a493e5def4d0baf76026663f63ae61
........^.?`.g.0
18:24:46.817063 IP CLIENT.56214 > SERVER.http: Flags [.], seq 289:1313, ack 1, win 29200, length 1024: HTTP
E..(.Z@.@.5...n:..n....PI8:v...HP.r.8...--04a493e5def4d0baf76026663f63ae61
< Data from packet 2, again! >
....p...^.?`.z.0
18:24:46.817068 IP CLIENT.56214 > SERVER.http: Flags [P.], seq 1313:1536, ack 1, win 29200, length 223: HTTP
E....[@.@.8/..n:..n....PI8>v...HP.r.5`..
< Data from packet 3, again! >
etc.
The first thing I note is that in this case all three packets are sent before the first one is acknowledged; after that the second packet is sent again and acknowledged, and then the third packet is sent again.
However, after this the server closes the connection, the whole request is sent again on a new connection for some reason, and we never get an HTTP/1.0 200 OK message together with a good response.
I know that the HTTP headers sent by the two clients are slightly different, but even synchronising those does not fix the communication between the client and the server. I also note that the packet sizes are different, but I cannot imagine that being an issue.
I also note that the packets sent via curl all have the PUSH flag set, but this is done inconsistently on the Python side. But other than that, I don't really see a difference.
So my question is: Why are the two acting differently, and how can I get the Python requests module to act more like curl in this case?
Python's Requests does not support "Expect: 100-continue" ([1], [2]). If you are communicating with a server that actually requires 100-continue for large POSTs (and it looks like that's the case here), your best bet is to find an HTTP library that supports it, for example libcurl/PycURL.
Manually adding the Expect: 100-continue header to the Requests call is unlikely to work either. The client is supposed to send that header, wait for a 100 Continue response, and only THEN send the body. Adding the header does not magically teach Requests that it has to wait for the 100 Continue response before sending the body; it will just send the body immediately without waiting. So, yeah: find an HTTP library that natively supports it (like libcurl/PycURL).
And if you can be bothered, it would be nice if you went to the relevant Requests feature request and voiced your support.
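To make the "send headers, wait, then send the body" sequence concrete, here is a minimal sketch using only the standard library's socket module. It is an illustration under assumptions, not what curl or any library does internally: the function name post_with_expect and the wait parameter are made up for this example, and the one-second default only mirrors the roughly one-second gap between headers and body visible in the curl capture above (where no 100 Continue reply shows up before curl sends the body). If the server answers with something other than 100, the sketch treats it as the final response and does not send the body.

```python
import socket


def post_with_expect(host, port, path, body, content_type, wait=1.0):
    """POST `body`, pausing after the headers for a '100 Continue' reply.

    Hypothetical helper for illustration; returns the raw response bytes.
    """
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Expect: 100-continue\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")

    with socket.create_connection((host, port), timeout=10) as sock:
        # Headers go out alone, in their own write.
        sock.sendall(head)

        # Step 1: wait up to `wait` seconds for an interim response.
        sock.settimeout(wait)
        buf = b""
        try:
            while b"\r\n\r\n" not in buf:
                chunk = sock.recv(4096)
                if not chunk:
                    return buf  # server closed the connection early
                buf += chunk
        except socket.timeout:
            pass  # no reply in time: fall through and send the body anyway

        # Step 2: decide what the early reply (if any) means.
        send_body = True
        if b"\r\n\r\n" in buf:
            status_line = buf.split(b"\r\n", 1)[0]
            if status_line.split(b" ")[1] == b"100":
                buf = buf.partition(b"\r\n\r\n")[2]  # drop the interim reply
            else:
                send_body = False  # final response (e.g. 417): keep the body back

        # Step 3: send the body, then collect the final response until EOF.
        sock.settimeout(10)
        if send_body:
            sock.sendall(body)
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            buf += chunk
        return buf
```

One way to get a multipart body for this without hand-building it is to let Requests prepare (but not send) the request: requests.Request("POST", url, data=data, files=files).prepare() returns an object whose .body and .headers["Content-Type"] can be passed in as body and content_type. Whether the device then accepts the upload still has to be checked against the real hardware.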