python html beautifulsoup wordnet

Encountering an error while using BeautifulSoup


I am trying to extract the words (verbs) starting with R from this page, but executing the following code raises an error:

from bs4 import BeautifulSoup
import urllib2
url = "http://www.usingenglish.com/reference/phrasal-verbs/r.html"
content = urllib2.urlopen(url).read()
soup = BeautifulSoup(content)
print soup.prettify()

The error thrown was something like this:

UnicodeEncodeError: 'charmap' codec can't encode character u'\xa9' in position 57801: character maps to <undefined>

Can someone please tell me what the error means and how to fix it?


Solution

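  • It would be much easier to help if you showed the whole stack trace or, at least, the line it points to.

    That said, the problem is most likely the last line. print has to encode the Unicode string returned by soup.prettify() to your console's encoding, and that encoding (a 'charmap' code page) has no mapping for the copyright sign u'\xa9'. Encode the output explicitly instead: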

    print(soup.prettify().encode('utf-8'))
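
To see why the explicit encode helps, here is a minimal sketch that reproduces the failure without fetching the page. It assumes the console uses code page cp437 (a common Windows console default; yours may differ), and the sample string and the helper name encodes_cleanly are made up for illustration.

```python
# -*- coding: utf-8 -*-
# Stand-in for the text returned by soup.prettify(); the real page
# contains a copyright sign (u'\xa9') somewhere in its markup.
text = u"Copyright \xa9 UsingEnglish.com"


def encodes_cleanly(s, encoding):
    """Return True if every character of s can be encoded with `encoding`."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Assumption: cp437 stands in for the console's code page. It is a
# charmap-based codec with no entry for the copyright sign.
cp437_ok = encodes_cleanly(text, "cp437")

# UTF-8 can represent every Unicode character, so this always succeeds.
utf8_ok = encodes_cleanly(text, "utf-8")
utf8_bytes = text.encode("utf-8")

# Alternative: keep the console's code page but substitute '?' for
# anything it cannot represent.
fallback = text.encode("cp437", "replace")
```

Encoding to UTF-8 avoids the exception, but a console that is not set to UTF-8 may show the copyright sign as two odd characters. If readable console output matters more than fidelity, the "replace" variant is the other option.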