I am trying to request a URL that contains white spaces so that I can download a file from the page. In a normal browser, e.g. Chrome, entering the URL into the address bar generates the file automatically and prompts you to download it.
Instead of loading a web browser every time I want a set of logs, I am trying to write a Python script that does all the hard work for me.
Example:
url = http (ip-address)/supportlog.xml/getlogs&name=0335008 04-05-2013 12.46.47.zip
I'm using the command:
xml_page = opener.open((url))
I have been able to download other zip files from the web server I am connecting to, using the command above and some other lines of code.
But when I try the same command with a URL that contains white spaces, urllib2 strips out all of the white spaces, so I get a syntax error back. Ideally the URL would be changed so that it contains no white spaces, but that isn't possible.
I have tried replacing the white spaces in the URL with %20, but this doesn't work and causes the server to fail.
I understand you can use the urllib.quote tool, but I'm not sure how to, or even whether this is the correct path to go down.
Any help is welcome... I'm still learning Python so please be kind.
To escape the white spaces in your URL, use urllib.quote. Apply it only to the part after the host, because quoting the whole URL also escapes the colon after http:
import urllib
url = "http://www.example.com/" + urllib.quote("a url with whitespaces")
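For a URL shaped like the one in the question, one possible approach is to split off the path, quote only that, and pass a safe argument so that separators such as & and = are left alone. This is only a sketch: the host 192.0.2.10 is a placeholder, the helper name quote_url_path and the chosen safe characters are assumptions for illustration, and the import fallback is there only so the snippet runs on both Python 2 and Python 3. Whether the server accepts the encoded form can only be confirmed by trying it.

```python
try:
    from urllib import quote  # Python 2
    from urlparse import urlsplit, urlunsplit
except ImportError:
    from urllib.parse import quote, urlsplit, urlunsplit  # Python 3


def quote_url_path(url, safe="/&="):
    """Percent-encode only the path of `url` (hypothetical helper)."""
    parts = urlsplit(url)
    # Scheme and host stay untouched; only the path is escaped.
    quoted_path = quote(parts.path, safe=safe)
    return urlunsplit(
        (parts.scheme, parts.netloc, quoted_path, parts.query, parts.fragment)
    )


# 192.0.2.10 is a placeholder host, not the real server.
raw = "http://192.0.2.10/supportlog.xml/getlogs&name=0335008 04-05-2013 12.46.47.zip"
print(quote_url_path(raw))
```

The safe argument lists the characters that quote should not escape. Its default is "/", which is why the colon gets escaped when the whole URL is quoted.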
Functions like urllib2.urlopen only return a file-like response; they do not save anything to disk by themselves. If you want to download straight to a file using the urllib modules, you need urllib.urlretrieve. However, requests is easier to grasp in the beginning.
import requests
response = requests.get(url)
The response provides several useful attributes:
response.content: The raw bytes of the response, e.g. the content of the downloaded file.
response.text: The response body decoded to text, e.g. the source code of the website.
response.status_code: Status code of your request. 200 is ok.
You probably want to save your downloaded file somewhere. So open a file with open in binary mode and write the content of your response. Do not forget to close the file.
your_file_connection = open('your_file', 'wb')
# file objects have write(), not save(); response.content holds the raw bytes
your_file_connection.write(response.content)
your_file_connection.flush()
your_file_connection.close()
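If installing requests is not an option, the same download-and-save step can be sketched with the standard library alone. This is a minimal sketch, not a definitive implementation: the function name download_to_file and the chunk size are illustrative choices, and the import fallback assumes urllib2.urlopen on Python 2 and urllib.request.urlopen on Python 3.

```python
import shutil

try:
    from urllib2 import urlopen  # Python 2
except ImportError:
    from urllib.request import urlopen  # Python 3


def download_to_file(url, destination, chunk_size=64 * 1024):
    """Stream the body of `url` into `destination` (hypothetical helper)."""
    response = urlopen(url)
    try:
        # 'wb' keeps the bytes untouched, which matters for zip files.
        with open(destination, "wb") as out_file:
            shutil.copyfileobj(response, out_file, chunk_size)
    finally:
        response.close()
    return destination
```

Calling download_to_file(url, 'your_file') with an already quoted URL writes the response to disk in chunks, so a large log archive does not have to fit in memory.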
Summary
import urllib
import requests
# quote only the path, not the scheme and host
url = "http://www.example.com/" + urllib.quote("a url with whitespaces")
response = requests.get(url)
# the with block closes the file automatically
with open('your_file', 'wb') as your_file_connection:
    your_file_connection.write(response.content)
requests documentation: http://docs.python-requests.org/en/latest/