I have a bs4 program that runs flawlessly on Python 2.7, but it fails when I run it on Python 3. I am using the latest version of bs4 in both environments. The input file is HTML, and the error appears to be triggered by a tag in it. Is there a supporting module I need to update, such as lxml?
Code:
from bs4 import BeautifulSoup
data = open(directory +'\\'+ file)
soup = BeautifulSoup(data, 'html.parser')
Here is the error:
...
File "C:\Anaconda3\lib\html\parser.py", line 174, in error
raise HTMLParseError(message, self.getpos())
html.parser.HTMLParseError: unknown status keyword 'NKXE' in marked section,
at line 318, column 49
Always appreciate the help!
Try installing html5lib, which is a more lenient parser:
pip install html5lib
Then pass 'html5lib' as the parser instead of 'html.parser' and see if that fixes the issue:
import os
from bs4 import BeautifulSoup
# directory and file are the same variables as in the question
with open(os.path.join(directory, file)) as data:
    soup = BeautifulSoup(data, 'html5lib')
This has worked for me.
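If you are not sure which parsers are installed on a given machine, a small fallback helper may be useful. The following is only a sketch: the names make_soup, load_soup and PARSERS are hypothetical, and the order in which parsers are tried is my own assumption, not something bs4 prescribes. It relies on bs4 raising FeatureNotFound when the requested parser is missing.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Parser names that bs4 recognizes; the preference order is an assumption.
PARSERS = ("html5lib", "lxml", "html.parser")


def make_soup(markup, parsers=PARSERS):
    """Return a BeautifulSoup tree built with the first installed parser."""
    for name in parsers:
        try:
            return BeautifulSoup(markup, name)
        except FeatureNotFound:
            # bs4 raises this when the requested parser is not installed.
            continue
    raise RuntimeError("none of the requested parsers is installed")


def load_soup(path, parsers=PARSERS):
    """Read an HTML file as bytes and parse it; the file is closed afterwards."""
    with open(path, "rb") as handle:
        # Passing bytes lets bs4 detect the document encoding itself.
        return make_soup(handle.read(), parsers)
```

With the variables from the question, you would call it as soup = load_soup(os.path.join(directory, file)) after importing os.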