I have a use case where I want to check (from within a Python/Django project) whether the response to a GET request is smaller than x bytes, whether the whole response completes within y seconds, and whether the response status is 200. The URL being tested is submitted by end users.
Some constraints:

- A `HEAD` request is not acceptable, simply because some servers might not include a `Content-Length`, or lie about it, or simply block `HEAD` requests.
- Downloading the whole `GET` response body is not acceptable either. Imagine an end user submitting a URL to a 10GB file: all my server bandwidth (and memory) would be consumed by this.

tl;dr: Is there any Python HTTP API that:
- kills the request (possibly with a TCP `RST`) once x bytes have been received, to avoid bandwidth starvation?

The x here would probably be on the order of KBs, and y would be a few seconds.
You could open the URL in `urllib` and `read(x+1)` from the returned object. If the length of the returned string is `x+1`, then the resource is larger than `x`. Then call `close()` on the object to close the connection, i.e. kill the request. In the worst case, this will fill the OS's TCP buffer, which is something you cannot avoid anyway; usually, this should not fetch more than a few kB more than `x`.
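
The read-then-close idea could look roughly like the sketch below. It assumes Python 3's `urllib.request`; the names `read_capped` and `check_url` are made up for illustration. The deadline check is only approximate, because the `timeout` passed to `urlopen` applies to each blocking socket operation rather than to the whole transfer.

```python
import time
import urllib.request


def read_capped(stream, max_bytes, deadline=None, chunk_size=4096):
    """Read at most max_bytes + 1 bytes from a file-like object.

    A result longer than max_bytes means the body exceeds the limit.
    (Hypothetical helper, not part of urllib.)
    """
    buf = bytearray()
    while len(buf) <= max_bytes:
        # Coarse overall deadline, checked between chunks only.
        if deadline is not None and time.monotonic() > deadline:
            raise TimeoutError("response did not complete in time")
        chunk = stream.read(min(chunk_size, max_bytes + 1 - len(buf)))
        if not chunk:  # end of body
            break
        buf.extend(chunk)
    return bytes(buf)


def check_url(url, max_bytes, max_seconds):
    """Return True if the URL answers 200 with a small, timely body."""
    deadline = time.monotonic() + max_seconds
    try:
        # Leaving the with-block closes the response, which drops the
        # rest of the body instead of downloading it.
        with urllib.request.urlopen(url, timeout=max_seconds) as resp:
            if resp.status != 200:
                return False
            body = read_capped(resp, max_bytes, deadline)
    except OSError:
        # Covers URLError/HTTPError, socket timeouts and TimeoutError.
        return False
    return len(body) <= max_bytes
```

For example, `check_url(submitted_url, 64 * 1024, 5)` would accept only bodies of at most 64 KiB that arrive within roughly five seconds. Note that `urlopen` follows redirects by default, so a 200 here may come from a different URL than the one submitted.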
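If you furthermore add a `Range` header to the request, sane servers will stop sending after `x+1` bytes themselves. Note that this changes the reply code to `206 Partial Content`. A resource shorter than the requested range is still served with 206; `416 Requested Range Not Satisfiable` should only come back when the range starts beyond the end of the resource, for example on an empty file. Servers which do not support this will ignore the header and answer with a plain 200, so this should be a safe measure.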
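
Adding the `Range` header might be sketched as follows, again assuming Python 3's `urllib.request`; `build_range_request` and `fetch_prefix` are hypothetical helper names. Byte ranges are inclusive, so `bytes=0-x` asks for `x+1` bytes.

```python
import urllib.request


def build_range_request(url, max_bytes):
    """Build a GET request asking only for the first max_bytes + 1 bytes."""
    req = urllib.request.Request(url)
    # Inclusive range: bytes 0..max_bytes is max_bytes + 1 bytes in total.
    req.add_header("Range", "bytes=0-%d" % max_bytes)
    return req


def fetch_prefix(url, max_bytes, timeout):
    """Return (status, first bytes of the body), capped at max_bytes + 1."""
    req = build_range_request(url, max_bytes)
    # A 416 reply would be raised by urlopen as urllib.error.HTTPError.
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        # 206 means the server honoured the range; 200 means it ignored
        # the header, so the local read cap is still needed.
        return resp.status, resp.read(max_bytes + 1)
```

Because a server may ignore the header, a caller would treat both 200 and 206 as success and still compare the length of the returned bytes against `max_bytes`.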