Tags: python, python-2.7, urllib

Python urlopen return value


I'm trying to read existing URLs from a file, pass each one to urlopen, and append its HTML to a single txt file:

for line in open('C:\Users\me\Desktop\URLS-HERE.txt'):
 if line.startswith('http') and line.endswith('html\n') :
    fichier = open("C:\Users\me\Desktop\other.txt", "a")
    allhtml = urllib.urlopen(line)
    fichier.write(allhtml)
    fichier.close()

But I get the following error:

TypeError: expected a character buffer object

Solution

  • The value returned by urllib.urlopen() is a file-like object, not a string, which is why write() raises the TypeError. Once you have opened the URL, read its contents with the read() method, as shown in the following snippet:

    import urllib

    for line in open(r'C:\Users\me\Desktop\URLS-HERE.txt'):
        if line.startswith('http') and line.endswith('html\n'):
            fichier = open(r'C:\Users\me\Desktop\other.txt', 'a')
            allhtml = urllib.urlopen(line)
            # urlopen() returns a file-like object; read() returns its contents as a string
            fichier.write(allhtml.read())
            fichier.close()
    

    Hope this helps!
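
If you are on Python 3 rather than 2.7, urlopen lives in urllib.request instead of urllib, and the same idea might look like the sketch below. The helper names fetch_body and append_pages and the fetch parameter are made up for illustration and are not part of urllib. Note that read() returns bytes on Python 3, so the output file is opened in binary append mode.

```python
import os
import tempfile
from urllib.request import urlopen  # Python 3 location of urlopen


def fetch_body(url):
    """Return the raw body of url as bytes (hypothetical helper)."""
    response = urlopen(url)  # file-like object, not the content itself
    try:
        return response.read()  # read() is what yields the actual data
    finally:
        response.close()


def append_pages(lines, out_path, fetch=fetch_body):
    """Append the body of every matching URL to out_path.

    lines is any iterable of text lines (for example an open file).
    Returns the number of pages written.
    """
    written = 0
    # Binary append mode, because fetch() returns bytes on Python 3.
    with open(out_path, "ab") as out:
        for line in lines:
            url = line.strip()  # drop the trailing newline from the list file
            if url.startswith("http") and url.endswith("html"):
                out.write(fetch(url))
                written += 1
    return written


if __name__ == "__main__":
    # Network-free demo: a stub stands in for the real download.
    demo_lines = ["http://example.com/a.html\n", "not a url\n"]
    demo_path = os.path.join(tempfile.mkdtemp(), "other.txt")
    print(append_pages(demo_lines, demo_path, fetch=lambda url: b"<html></html>"))
```

Passing fetch as a parameter is only a convenience so the loop can be tried without network access. In real use you would call append_pages(open(url_list_path), out_path) and let it default to fetch_body.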