I'm trying to get the HTML code from inside an XML file, but all I get are the individual elements.
XML-Example:
<?xml version="1.0" encoding="ISO-8859-1"?>
<websites>
<website name="1">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title/>
</head><body>Sample Content.....</body>
</html>
</website>
</websites>
I need a string containing only the HTML, like this:
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title/>
</head><body>Sample Content.....</body>
</html>
You can use BeautifulSoup (the bs4 package):
from bs4 import BeautifulSoup
example = """
<?xml version="1.0" encoding="ISO-8859-1"?>
<websites>
<website name="1">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title/>
</head><body>Sample Content.....</body>
</html>
</website>
</websites>
"""
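# Name the parser explicitly; without it bs4 picks one itself and emits a warning.
soup = BeautifulSoup(example, "html.parser")
# find() returns the first matching Tag; str() turns it into the string asked for.
html = str(soup.find('html'))
print(html)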
Output (note that the empty <title/> is written back as <title></title>):
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title></title>
</head><body>Sample Content.....</body>
</html>
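If installing a third-party package is not an option, the same extraction can probably be done with the standard library's xml.etree.ElementTree. The following is only a sketch under a few assumptions: the input is well-formed XML, each <website> holds a single <html> child in the XHTML namespace, and extract_html is a hypothetical helper name, not something from the question.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Bytes input lets the parser honour the declared ISO-8859-1 encoding.
# The XML declaration has to be the very first thing in the document.
xml_bytes = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<websites>
<website name="1">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title/>
</head><body>Sample Content.....</body>
</html>
</website>
</websites>
"""


def extract_html(xml_data):
    """Return {website name: serialized <html> subtree} (hypothetical helper)."""
    # Register XHTML as the default namespace so the output is written
    # as <html xmlns="..."> rather than with a generated "ns0:" prefix.
    ET.register_namespace("", XHTML_NS)
    root = ET.fromstring(xml_data)
    result = {}
    for site in root.iter("website"):
        # The <html> child lives in the XHTML namespace, so qualify the tag.
        html = site.find(f"{{{XHTML_NS}}}html")
        if html is not None:
            # tostring() also emits the whitespace that follows the element
            # (its "tail"), so strip it off.
            text = ET.tostring(html, encoding="unicode").strip()
            result[site.get("name")] = text
    return result


pages = extract_html(xml_bytes)
print(pages["1"])
```

Because this goes through an XML parser, the result stays XML-style: the empty title is serialized as a self-closing tag instead of <title></title>. Keying the result by the name attribute is a design choice for files with several <website> entries; for a single entry, a plain find() on the root would do.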