python, html, parsing, python-3.x, urllib

Save the HTML of a website to a txt file with Python


I need to save the HTML code of any website to a txt file. It is a very easy exercise, but I am stuck, because I have a function that is supposed to do this:

import urllib.request

def get_html(url):
    f=open('htmlcode.txt','w')
    page=urllib.request.urlopen(url)
    pagetext=page.read() ## Save the html and later save in the file
    f.write(pagetext)
    f.close()

But this doesn't work.


Solution

  • The easiest way is to use urlretrieve. In Python 2:

    import urllib
    
    urllib.urlretrieve("http://www.example.com/test.html", "test.txt")
    

    For Python 3.x the code is as follows:

    import urllib.request    
    urllib.request.urlretrieve("http://www.example.com/test.html", "test.txt")
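As for why the function in the question fails: in Python 3, `page.read()` returns `bytes`, while a file opened with mode `'w'` expects `str`, so `f.write(pagetext)` raises a `TypeError`. A minimal sketch of one possible fix is to open the file in binary mode. The `filename` parameter is an addition made here for illustration; it is not part of the original function.

```python
import urllib.request


def get_html(url, filename="htmlcode.txt"):
    """Download the page at `url` and save its raw bytes to `filename`.

    `filename` is a hypothetical parameter added for this sketch; the
    default matches the name hard-coded in the question.
    """
    # In Python 3, read() on the response returns bytes, not str.
    with urllib.request.urlopen(url) as page:
        pagetext = page.read()
    # Binary mode ("wb") writes the bytes unchanged, so no decoding is needed.
    with open(filename, "wb") as f:
        f.write(pagetext)
```

Calling `get_html("http://www.example.com/test.html")` should then write the page to `htmlcode.txt`. If you would rather keep the file in text mode, another option is to decode the bytes first, for example `pagetext.decode("utf-8")`, assuming the page really is UTF-8 encoded.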