The code below works fine for small files (<100MB or so), but fails for larger ones (uncomment the second url in main() to see the problem). What baffles me is that the failure is immediate, presumably as soon as Tornado sees the Content-Length header -- but from what I understood, streaming_callback should make it work with arbitrarily large files.
import tornado.httpclient
import tornado.ioloop

def main():
    url = "https://www.python.org/ftp/python/2.7.13/python-2.7.13.msi"
    # url = "http://releases.ubuntu.com/16.04.1/ubuntu-16.04.1-desktop-amd64.iso?_ga=1.179801251.666251388.1483725275"
    client = tornado.httpclient.AsyncHTTPClient()
    request = tornado.httpclient.HTTPRequest(url=url, streaming_callback=on_chunk)
    client.fetch(request, on_done)

total_data = 0

def on_done(response):
    print total_data
    print response

def on_chunk(chunk):
    # Should be called with each chunk of the body as it arrives,
    # instead of buffering the whole response in memory.
    global total_data
    total_data += len(chunk)

main()
tornado.ioloop.IOLoop.current().start()
I get:
19161088 HTTPResponse(_body=None,buffer=<_io.BytesIO object at 0x7f7a57563258>,code=200,effective_url='https://www.python.org/ftp/python/2.7.13/python-2.7.13.msi',error=None,headers=,reason='OK',request=,request_time=0.7110521793365479,time_info={})
for the Python installer, but
0 HTTPResponse(_body=None,buffer=None,code=599,effective_url='http://releases.ubuntu.com/16.04.1/ubuntu-16.04.1-desktop-amd64.iso?_ga=1.179801251.666251388.1483725275',error=HTTP 599: Connection closed,headers=,reason='Unknown',request=,request_time=0.10775566101074219,time_info={})
for the Ubuntu ISO.
streaming_callback can work with a file of any size, but by default AsyncHTTPClient still enforces a 100MB limit on the response body; the fetch is rejected as soon as the Content-Length header exceeds that limit, which is why the failure is immediate. To raise the limit, configure the client with a larger max_body_size:

    AsyncHTTPClient.configure(None, max_body_size=1000000000)
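For completeness, here is a minimal sketch of the question's script with that change applied (same Python 2 / callback-style Tornado API as the question; the 2 GiB max_body_size, the request_timeout bump, and the IOLoop.stop() call are my additions -- the answer's suggested 1000000000 bytes is still smaller than the ~1.5 GB Ubuntu ISO, and the default 20-second request timeout would cut off a download that large):

    import tornado.httpclient
    import tornado.ioloop

    total_data = 0

    def on_chunk(chunk):
        # Receives the body incrementally; the response itself stays empty.
        global total_data
        total_data += len(chunk)

    def on_done(response):
        print total_data
        print response
        tornado.ioloop.IOLoop.current().stop()

    def main():
        # Must run before the first AsyncHTTPClient is instantiated:
        # instances are cached per IOLoop, and configure() only affects
        # clients created afterwards.
        tornado.httpclient.AsyncHTTPClient.configure(
            None, max_body_size=2 * 1024 ** 3)  # 2 GiB, enough for the ISO
        url = "http://releases.ubuntu.com/16.04.1/ubuntu-16.04.1-desktop-amd64.iso?_ga=1.179801251.666251388.1483725275"
        client = tornado.httpclient.AsyncHTTPClient()
        request = tornado.httpclient.HTTPRequest(
            url=url,
            streaming_callback=on_chunk,
            request_timeout=3600)  # default is 20s, too short for a large file
        client.fetch(request, on_done)

    main()
    tornado.ioloop.IOLoop.current().start()

Note that streaming_callback means the body is not accumulated in the response, so raising max_body_size here does not raise memory usage; the limit is only checked against Content-Length.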