Python: retrieving web data


I am new to Python and have been trying to work out the following exercise.

Exercise 5: (Advanced) Change the socket program so that it only shows data after the headers and a blank line have been received. Remember that recv is receiving characters (newlines and all), not lines.

Below is the code I came up with. Unfortunately, I don't think it is working:

import socket

mysocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysocket.connect(('data.pr4e.org', 80))
mysocket.send('GET http://data.pr4e.org/romeo.txt HTTP/1.0\r\n\r\n'.encode())

count = 0
while True:
    data = mysocket.recv(200)

    if len(data) < 1:
        break

    count = count + len(data.decode().strip())
    print(len(data), count)
    if count >= 399:
        print(data.decode(), end="")
mysocket.close()

Solution

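  • Instead of counting characters against a hard-coded threshold, collect everything the server sends and then split on the first double CRLF (\r\n\r\n), which marks the end of the headers. The header length can vary between responses, so a fixed count such as 399 is fragile.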

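    import socket

    mysocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    mysocket.connect(('data.pr4e.org', 80))
    mysocket.send('GET http://data.pr4e.org/romeo.txt HTTP/1.0\r\n\r\n'.encode())

    # Collect raw bytes; decoding chunk by chunk could split a multi-byte character.
    chunks = []
    while True:
        data = mysocket.recv(200)
        if not data:  # empty bytes: the server closed the connection
            break
        chunks.append(data)
    mysocket.close()

    # The headers end at the first blank line; everything after it is the body.
    resp = b"".join(chunks).decode()
    body = resp.partition('\r\n\r\n')[2]
    print(body)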
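If you would rather show the body as it arrives instead of buffering the whole response first, here is a minimal sketch of a streaming variant. The names `HEADER_END`, `iter_body`, and `recv_chunks` are made up for this illustration and are not part of the exercise or of the `socket` module. The part that needs care is that the `\r\n\r\n` terminator may be split across two `recv` calls, so the sketch keeps buffering until the full terminator has been seen.

```python
HEADER_END = b"\r\n\r\n"  # blank line that separates headers from body


def iter_body(chunks):
    """Yield body bytes from an iterable of raw HTTP response chunks.

    Everything up to and including the first blank line is discarded.
    """
    buffer = b""
    in_body = False
    for chunk in chunks:
        if in_body:
            # Headers are already behind us: pass data straight through.
            yield chunk
            continue
        # Still in the headers: accumulate, because the terminator
        # may straddle the boundary between two chunks.
        buffer += chunk
        _head, sep, rest = buffer.partition(HEADER_END)
        if sep:
            in_body = True
            if rest:
                yield rest


def recv_chunks(sock, size=200):
    """Yield chunks from a connected socket until the peer closes it."""
    while True:
        data = sock.recv(size)
        if not data:
            return
        yield data
```

With the connected `mysocket` from the question, something like `for part in iter_body(recv_chunks(mysocket)): sys.stdout.buffer.write(part)` (after `import sys`) should print only the body. Writing raw bytes avoids decoding a chunk that ends in the middle of a multi-byte character.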