I'm trying to fetch the HTML source of Google's home page using only Python sockets, without any other libraries such as urllib. I don't understand why my GET request isn't working; I've tried every variation I could think of. The code I have is below. It's pretty small, and I don't want to go into too much detail. I'm just looking for the protocol that's used to fetch a page's source. I assumed it would be the GET method, but it doesn't work. I need a response that resembles what urllib.request returns, but using Python sockets only.
socket.gethostbyname(), it fails on the getaddrinfo.

import socket
s = socket.socket()
host = socket.gethostbyname("www.google.com")
port = 80
send_buf = "GET / \r\n" \
           "Host: www.google.com\r\n"
s.connect((host, port))
s.sendall(bytes(send_buf, encoding="utf-8"))
data = ""
part = None
while True:
    part = s.recv(2048)
    data += str(part, "utf-8")
    if part == b'':
        break
s.close()
The following worked for me:
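import socket

s = socket.socket()
host = socket.gethostbyname('www.google.com')
port = 80
s.connect((host, port))
# Python 3 sockets send and receive bytes, not str
s.sendall(b"GET /\r\n")
# A single recv() may return only the first part of the response
val = s.recv(10000)
s.close()
# Split off the HTTP headers
val = val.split(b'\r\n\r\n', 1)[1]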
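If you want the whole page rather than whatever the first recv() happens to return, here is a minimal sketch of a fuller request. It assumes plain HTTP on port 80 and sends an HTTP/1.0 request line so the server should not use chunked transfer encoding. The request in the question never sends the blank line that ends the header block, which is likely why the server keeps waiting instead of answering. The helper names build_request, split_response and fetch are my own for illustration, not part of any library.

```python
import socket


def build_request(host, path="/"):
    """Return a minimal HTTP/1.0 GET request as bytes (hypothetical helper)."""
    lines = [
        f"GET {path} HTTP/1.0",  # request line: method, path, protocol version
        f"Host: {host}",
        "Connection: close",     # ask the server to close the socket when done
        "",                      # the empty entries produce the blank line
        "",                      # that terminates the header block
    ]
    return "\r\n".join(lines).encode("ascii")


def split_response(raw):
    """Split raw response bytes into (header_bytes, body_bytes)."""
    head, _sep, body = raw.partition(b"\r\n\r\n")
    return head, body


def fetch(host, path="/", port=80, timeout=10):
    """Send the request and read until the server closes the connection."""
    with socket.create_connection((host, port), timeout=timeout) as s:
        s.sendall(build_request(host, path))
        chunks = []
        while True:
            part = s.recv(4096)
            if not part:         # b"" means the peer closed the connection
                break
            chunks.append(part)
    return split_response(b"".join(chunks))


# Example call (needs network access):
# headers, body = fetch("www.google.com")
```

The body comes back as bytes. Decoding it is left to the caller, since the right charset depends on the Content-Type header of the response.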