Tags: python, sockets, network-programming, client

A Python socket client that outputs the source code of a website: why isn't this working?


The following code doesn't output anything. Why?

#!/usr/bin/python           
import socket             

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)                 

s.connect(("www.python.org" , 80))
print s.recv(4096)
s.close()

What do I have to change to output the source code of the Python website, as you would see it with "view source" in a browser?


Solution

  • HTTP is a request/response protocol. You're not sending a request, so the server never sends a response.

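    import socket

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

    s.connect(("www.python.org", 80))
    # You're missing this line: send a request before waiting for a response.
    s.sendall("GET / HTTP/1.0\r\nHost: www.python.org\r\n\r\n")
    print s.recv(4096)  # first chunk only (at most 4096 bytes)
    s.close()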
    

    Of course, this sends only the most basic HTTP/1.0 request, without handling HTTP errors, HTTP redirects, etc. I would not recommend it for real use beyond an exercise to familiarize yourself with socket programming and HTTP.
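
A single `recv` call returns only whatever data has arrived so far, so one way to go a step further is to keep reading until the server closes the connection, which is how an HTTP/1.0 response normally ends. The following is a minimal sketch under some assumptions: it is written for Python 3 (data is sent and received as bytes), and the helper names `build_request`, `read_all`, `split_response`, and `fetch` are made up for illustration. It still does not handle redirects, errors, or HTTPS.

```python
import socket


def build_request(host, path="/"):
    """Return a minimal HTTP/1.0 GET request as bytes (hypothetical helper)."""
    lines = [
        "GET {} HTTP/1.0".format(path),
        "Host: {}".format(host),
        "",  # blank line that ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def read_all(sock, bufsize=4096):
    """Read from the socket until the peer closes the connection."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:  # empty bytes means the other side closed
            break
        chunks.append(chunk)
    return b"".join(chunks)


def split_response(raw):
    """Split a raw response into (status line, body), both as bytes."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0]
    return status_line, body


def fetch(host, port=80, path="/"):
    """Send one GET request and return (status line, body)."""
    with socket.create_connection((host, port), timeout=10) as s:
        s.sendall(build_request(host, path))
        return split_response(read_all(s))
```

Calling `fetch("www.python.org")` would return the status line and whatever body the server sends. If the server answers with a redirect or an error status, the body will not be the page source, which is one reason the higher-level modules are usually the better choice.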

    For HTTP, Python provides a few built-in modules: httplib (a bit lower level), and urllib and urllib2 (higher level).
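
For comparison, here is a sketch of the high-level route. It assumes Python 3, where the functionality of `urllib2` lives in `urllib.request` (and `httplib` is named `http.client`). `fetch_source` is a hypothetical helper name, and falling back to UTF-8 when no charset is declared is an assumption, not something the library mandates.

```python
from urllib.request import urlopen


def fetch_source(url, timeout=10):
    """Return the decoded body of the resource at `url` (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as resp:
        # Use the charset from the Content-Type header if one is declared;
        # otherwise assume UTF-8.
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

A call such as `print(fetch_source("http://www.python.org/"))` should print the page source. Unlike the raw socket version, `urlopen` follows redirects and raises `urllib.error.HTTPError` for error statuses.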