I can download a file from a URL in the following way:
import urllib2
response = urllib2.urlopen("http://www.someurl.com/file.pdf")
html = response.read()  # the raw bytes of the downloaded file
One way I can think of is to open a file in binary mode and re-save the downloaded data to the different folder where I want it (see the sketch below), but is there a better way?
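Something like this, as a rough sketch (the destination path here is just a placeholder):
# rough sketch of my workaround: write the bytes already read above into a new file
# "/home/user/some/other/folder/file.pdf" is a made-up destination path
with open("/home/user/some/other/folder/file.pdf", "wb") as f:
    f.write(html)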
Thanks
The function you are looking for is urllib.urlretrieve:
import urllib
linkToFile = "http://www.someurl.com/file.pdf"
localDestination = "/home/user/local/path/to/file.pdf"
resultFilePath, responseHeaders = urllib.urlretrieve(linkToFile, localDestination)
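On Python 3 the same function is available as urllib.request.urlretrieve with the same call signature, so the equivalent would look roughly like this:
from urllib.request import urlretrieve
linkToFile = "http://www.someurl.com/file.pdf"
localDestination = "/home/user/local/path/to/file.pdf"
# downloads the URL straight to localDestination and returns the local path and response headers
resultFilePath, responseHeaders = urlretrieve(linkToFile, localDestination)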
UPD from 2023: Although urlretrieve is still available as urllib.request.urlretrieve, it is documented as a legacy interface that might become deprecated in the future. I can't find any built-in replacement with the same interface, and the best solution I can offer to those who want to keep their code up to date is to download the file with the requests library and then dump it to a file:
import requests
linkToFile = "http://www.someurl.com/best_file.pdf"
localDestination = "/home/user/path/to/file.pdf"
response = requests.get(linkToFile)
# raise an exception if the download failed (e.g. a 404 or 500 response)
response.raise_for_status()
with open(localDestination, 'wb') as file:
    file.write(response.content)
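Note that response.content keeps the whole file in memory, which is fine for small downloads but wasteful for large ones.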
If you are not sure about the size of the file, you could download it in chunks:
import requests
linkToFile = "http://www.someurl.com/file.pdf"
localDestination = "/home/user/local/path/to/file.pdf"
# stream=True tells requests not to load the whole response into memory at once
response = requests.get(linkToFile, stream=True)
response.raise_for_status()
with open(localDestination, 'wb') as file:
    # you could set an arbitrary chunk size; 8 KiB is a common choice
    for chunk in response.iter_content(chunk_size=8192):
        file.write(chunk)
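Another option with the same streaming behaviour is to let shutil copy the raw response stream to the file. A minimal sketch, using the same made-up URL and path as above:
import shutil
import requests
linkToFile = "http://www.someurl.com/file.pdf"
localDestination = "/home/user/local/path/to/file.pdf"
# stream=True exposes the body as a file-like object on response.raw
with requests.get(linkToFile, stream=True) as response:
    response.raise_for_status()
    with open(localDestination, 'wb') as file:
        # copy the raw stream to disk without holding it all in memory
        shutil.copyfileobj(response.raw, file)
Note that response.raw does not apply content decoding (e.g. gzip), so iter_content is usually the safer default.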
Or you could use the wget package, as suggested in the answer below (though that library seems to have had no updates since 2015):
import wget
url = "http://someurl.com/file.pdf"
wget.download(url, '/path/to/file.pdf')