
Adding an XML tag to a SOAP request using BeautifulSoup with Python


I'm trying to add a tag to an XML file. When the tag is stored as a string variable, the < and > characters are escaped to &lt; and &gt; when I append() it to the XML.

    import html
    from bs4 import BeautifulSoup

    # xml holds the SOAP request markup (not shown)
    no_email = '<errorEmailNotification enabled="false"/>'
    bs = BeautifulSoup(xml, "xml")
    header = bs.find("S:Header")
    header.append(no_email)
    print(header.prettify())

which results in this output:

    <S:Header>
     <Some Header info/>
     &lt;errorEmailNotification enabled="false"/&gt;
    </S:Header>

I then tried html.unescape() on the string, and it looks fine when printed:

    no_email = html.unescape('&lt;errorEmailNotification enabled="false"/&gt;')
    print(no_email)

    <errorEmailNotification enabled="false"/>

But when I append no_email to a specific tag with BeautifulSoup, the output is still escaped, and I'm not sure why.

    import html
    from bs4 import BeautifulSoup

    no_email = html.unescape('&lt;errorEmailNotification enabled="false"/&gt;')
    bs = BeautifulSoup(xml, "xml")
    header = bs.find("S:Header")
    header.append(no_email)
    print(header.prettify())

Output:

    <S:Header>
     <Some Header Info/>
     &lt;errorEmailNotification enabled="false"/&gt;
    </S:Header>

How can I use this correctly so that I get this result:

    <S:Header>
     <Some Header Info/>
     <errorEmailNotification enabled="false"/>
    </S:Header>

Solution

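  • Appending a str makes BeautifulSoup insert it as text (a NavigableString), so < and > are escaped on output. Create the element with new_tag() and append that instead of a string: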

    import bs4

    # Minimal stand-in for the SOAP envelope
    xml = """<root xmlns:S="something">
    <S:Header>
     <Some_Header_info/>
    </S:Header></root>"""

    soup = bs4.BeautifulSoup(xml, "xml")
    # new_tag() builds a real element; keyword arguments become attributes
    no_email = soup.new_tag("errorEmailNotification", enabled="false")
    header = soup.find("S:Header")
    header.append(no_email)
    print(soup.prettify())

    Output:

    <?xml version="1.0" encoding="utf-8"?>
    <root xmlns:S="something">
     <S:Header>
      <Some_Header_info/>
      <errorEmailNotification enabled="false"/>
     </S:Header>
    </root>
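
If the tag really does arrive as a string, as in the question, another option is to parse that string into its own small tree and append the resulting element rather than the raw text. The following is only a sketch under a few assumptions: append_fragment is a hypothetical helper name (not part of BeautifulSoup), the "xml" parser needs lxml installed, the sample envelope is the same placeholder used above, and the fragment must be well-formed on its own.

```python
from bs4 import BeautifulSoup

# Placeholder envelope; the namespace URI and child element are illustrative only.
xml = """<root xmlns:S="something">
<S:Header>
 <Some_Header_info/>
</S:Header></root>"""


def append_fragment(parent, fragment):
    """Hypothetical helper: parse an XML string and append its first element to parent."""
    # Parsing turns the markup into a Tag. A bare str would be
    # inserted as text and escaped when the tree is serialized.
    parsed = BeautifulSoup(fragment, "xml")
    element = parsed.find(True)  # first element in the parsed fragment
    parent.append(element)       # moves the element into the target tree
    return element


soup = BeautifulSoup(xml, "xml")
header = soup.find("S:Header")
append_fragment(header, '<errorEmailNotification enabled="false"/>')
print(header.prettify())
```

When the tag name and attributes are known up front, new_tag() is the simpler route; parsing a fragment is mainly useful when the markup is supplied by something else as text.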