Using httplib to connect to a website in Python


tl;dr: I used httplib to create a connection to a site and failed. I'd love some guidance!

I've run into some trouble. I read about Python's socket and httplib modules, although I seem to have some problems with the syntax.

Here it is:

connection = httplib.HTTPConnection('www.site.org', 80, timeout=10, 1.2.3.4)

The syntax is this:

httplib.HTTPConnection(host[, port[, strict[, timeout[, source_address]]]])

How does "source_address" behave? Can I make requests from any IP with it? Wouldn't I need a User-Agent for it?

Also, how do I check if the connection is successful?

if connection:
    print "Connection Successful."

(As far as I know, HTTP doesn't need an "are you alive" ping every second; as long as both client and server are okay, a request will be processed when it is made. So I can't constantly ping.)


Solution

  • Creating the object does not actually connect to the website:
    HTTPConnection.connect(): Connect to the server specified when the object was created.

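    source_address is not sent to the server. It is an optional (host, port) tuple that the local socket is bound to before connecting, so it selects which of your own machine's addresses the connection originates from; you cannot make requests from an arbitrary IP with it. It has to be passed as a tuple such as ('1.2.3.4', 0) rather than a bare 1.2.3.4, and as a keyword argument if it follows timeout=10. A User-Agent is an unrelated HTTP header and has nothing to do with it.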

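    The object has no method that reports whether a connection was made, and "if connection:" is always true because it only tests the object itself. Instead, connect() raises socket.error (socket.timeout on a timeout) if it fails, so if no exception is raised, the connection was made.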

    Assuming what you want to do is get the contents of the website root, you can use this:

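    from httplib import HTTPConnection

    # Creating the object does not open a socket yet.
    conn = HTTPConnection("www.site.org", 80, timeout=10)
    conn.connect()  # raises socket.error if the connection fails

    # For a direct connection, request a path on the host rather than a full URL.
    conn.request("GET", "/")
    resp = conn.getresponse()

    data = resp.read()
    print(data)
    conn.close()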

    (slammed together from the HTTPConnection documentation)

    Honestly though, you should not be using httplib, but instead urllib2 or another HTTP library that is less... low-level.
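
The question of checking whether the connection succeeded, and of what source_address does, can be sketched as follows. This is a minimal sketch and not the answerer's code: try_connect is a hypothetical helper name, and the fallback import of http.client (the Python 3 name for httplib) is an assumption added so the snippet runs on either version.

```python
import socket

try:
    import httplib  # Python 2, as used in this thread
except ImportError:
    import http.client as httplib  # assumption: Python 3 equivalent module


def try_connect(host, port=80, timeout=10, source_address=None):
    """Hypothetical helper: return (conn, None) on success, (None, error) on failure."""
    # source_address, if given, is a (host, port) tuple naming a LOCAL address
    # to bind to before connecting; port 0 lets the OS pick a free port.
    conn = httplib.HTTPConnection(
        host, port, timeout=timeout, source_address=source_address
    )
    try:
        # Nothing is sent over the network until connect() is called.
        conn.connect()
    except socket.error as exc:
        # Covers refused connections, timeouts, DNS failures and a
        # source_address that cannot be bound on this machine.
        return None, exc
    return conn, None
```

For example, try_connect('www.site.org', 80, source_address=('1.2.3.4', 0)) would normally return an error unless 1.2.3.4 is an address assigned to the local machine, because the bind happens locally before any traffic is sent.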