
How to send a urllib2 request with added white spaces


I am trying to request a URL that contains white spaces so that I can download a file from the page. In a normal browser, e.g. Chrome, when you enter the URL into the address bar, the file is generated automatically and you are asked to download it.

Instead of having to load a web browser every time I want a set of logs, I am trying to create a Python script that will do all the hard work for me.

Example:

url = http (ip-address)/supportlog.xml/getlogs&name=0335008 04-05-2013 12.46.47.zip 

I'm using the command:

xml_page = opener.open((url))

With this command and a few other lines of code, I have been able to download other zip files from the web server I am connecting to without problems.

But when I try the same command on the URL containing white spaces, urllib2 strips out all of the white spaces, so I get a syntax error back. Ideally I would change the URL so that it contains no white spaces, but this isn't possible.

I have tried replacing the white spaces in the URL with %20, but this doesn't work and causes the server to fail.

I understand you can use urllib.quote, but I am not sure how to use it, or even whether this is the correct path to go down.

Any help is welcome... I'm still learning Python, so please be kind.


Solution

  • To escape the white spaces in your URL, use urllib.quote. Pass safe=":/" so that the colon after the scheme is left unescaped:

    import urllib
    url = urllib.quote("http://www.example.com/a url with whitespaces", safe=":/")
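
    One thing to watch: with its default arguments, quote also escapes the colon after http, so applying it to a whole URL can mangle the scheme. One way to avoid that, assuming only the path needs escaping, is to split the URL first and quote just the path. This is a minimal sketch rather than a definitive fix: the helper name quote_url_path is made up for illustration, and the import fallback is there only so the snippet also runs on Python 3.

```python
try:
    # Python 2, the version this answer was written for
    from urllib import quote
    from urlparse import urlsplit, urlunsplit
except ImportError:
    # Python 3 moved these functions into urllib.parse
    from urllib.parse import quote, urlsplit, urlunsplit


def quote_url_path(url):
    """Percent-encode only the path component of a URL (hypothetical helper)."""
    parts = urlsplit(url)
    # quote() leaves "/" alone by default, so the path separators survive
    encoded_path = quote(parts.path)
    return urlunsplit((parts.scheme, parts.netloc, encoded_path,
                       parts.query, parts.fragment))


print(quote_url_path("http://www.example.com/a url with whitespaces"))
# prints http://www.example.com/a%20url%20with%20whitespaces
```

    Whether the server accepts %20 in place of a space depends on the server, so it is worth comparing the result with the URL the browser actually sends.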
    

    urllib2.urlopen only returns a file-like object, so you would have to read the data and write it to disk yourself. To save a file directly with the urllib modules, use urllib.urlretrieve. However, requests is easier to grasp in the beginning.

    import requests
    response = requests.get(url)
    

    The response provides several useful attributes:

    • response.text: The body decoded as text, e.g. the source code of a web page.
    • response.content: The body as raw bytes. Use this for binary downloads such as zip files.
    • response.status_code: The status code of your request. 200 means OK.

    You probably want to save your downloaded file somewhere. Open a file with open in binary mode, write the content of your response to it, and do not forget to close the file.

    your_file_connection = open('your_file', 'wb')
    # file objects have write(), not save(); response.content holds the raw bytes
    your_file_connection.write(response.content)
    your_file_connection.flush()
    your_file_connection.close()
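
    The saving step can also be wrapped in a with block, which closes the file automatically. The following is only a sketch: save_response is a hypothetical helper, and it assumes the object passed in behaves like a requests response, i.e. that it has status_code and content attributes.

```python
def save_response(response, path):
    """Write a response body to `path` (hypothetical helper).

    `response` is assumed to look like a requests response, i.e. to have
    `status_code` and `content` attributes.
    """
    if response.status_code != 200:
        raise IOError("download failed with status %d" % response.status_code)
    # "wb" keeps binary data such as zip archives intact; the with block
    # closes the file even if write() raises
    with open(path, "wb") as output_file:
        output_file.write(response.content)
    return path
```

    With requests installed, a call could look like save_response(requests.get(url), 'your_file').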
    

    Summary

    import urllib
    import requests
    
    # safe=":/" keeps the "http://" prefix intact while spaces become %20
    url = urllib.quote("http://www.example.com/a url with whitespaces", safe=":/")
    response = requests.get(url)
    
    your_file_connection = open('your_file', 'wb')
    your_file_connection.write(response.content)
    your_file_connection.flush()
    your_file_connection.close()
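
    For comparison, the urllib.urlretrieve route needs no third-party package. A minimal sketch, assuming the URL passed in is already escaped. download_to_file is a hypothetical name, and the import fallback only exists so the snippet also runs on Python 3.

```python
try:
    from urllib import urlretrieve  # Python 2
except ImportError:
    from urllib.request import urlretrieve  # Python 3


def download_to_file(url, destination):
    """Fetch `url` and store it at `destination` (hypothetical helper)."""
    # urlretrieve copies the body to disk block by block and returns
    # a (filename, headers) tuple
    saved_path, _headers = urlretrieve(url, destination)
    return saved_path
```

    Unlike the requests version, this gives no direct access to the status code, so a failed request has to be caught as an exception.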
    

    requests Documentation: http://docs.python-requests.org/en/latest/