HTML unescape does not work with BeautifulSoup replace_with

I am trying to edit the inner HTML of some elements in Python using BeautifulSoup. Here is a simple example:

from bs4 import BeautifulSoup
import html

html_str = '<div><span><strong>Hello world</strong></span></div>'
soup = BeautifulSoup(html_str, 'html.parser')
span = soup.select_one('span')
span.replace_with('message: ' + html.unescape(span.decode_contents()) + ', end of message')

print(soup)

I was expecting to get a decoded string, like: <div>message: <strong>Hello world</strong>, end of message</div>

But instead I got: <div>message: &lt;strong&gt;Hello world&lt;/strong&gt;, end of message</div>

Note that this only happens when the target element contains a child element. If you run the same code on the strong element (with soup.select_one('strong')), it works as expected.

Solution
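
For context, the escaping most likely comes from how replace_with treats a plain str: it stores it as a NavigableString (a text node), and the characters <, > and & in text nodes are escaped when the tree is serialized. html.unescape has no effect here, because decode_contents() already returns raw markup with no entities in it. A minimal sketch to check this (the variable names are illustrative, not from the question):

```python
from bs4 import BeautifulSoup, NavigableString

soup = BeautifulSoup("<div><span><strong>Hello world</strong></span></div>", "html.parser")
span = soup.select_one("span")

# Passing a plain str: Beautiful Soup stores it as a single text node.
span.replace_with("message: " + span.decode_contents() + ", end of message")

node = soup.div.contents[0]
is_text_node = isinstance(node, NavigableString)   # the markup was not parsed
has_strong_tag = soup.find("strong") is not None   # no <strong> element is left in the tree
serialized = str(soup)                             # angle brackets are escaped on output

print(is_text_node, has_strong_tag)
print(serialized)
```

So the string has to be parsed into tags before it is inserted.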

The easiest way is to pass a new BeautifulSoup object to .replace_with, e.g.:

from bs4 import BeautifulSoup

html_str = "<div><span><strong>Hello world</strong></span></div>"
soup = BeautifulSoup(html_str, "html.parser")

span = soup.select_one("span")
span.replace_with(BeautifulSoup(f"message: {str(span)}, end of message", "html.parser"))

print(soup)

Prints:

<div>message: <span><strong>Hello world</strong></span>, end of message</div>
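
Note that str(span) serializes the element itself, so the <span> wrapper is kept in the result. To get the output the question asks for, one option is to re-parse only the inner markup returned by span.decode_contents(). A minimal sketch along the same lines (inner_html, fragment and result are illustrative names):

```python
from bs4 import BeautifulSoup

html_str = "<div><span><strong>Hello world</strong></span></div>"
soup = BeautifulSoup(html_str, "html.parser")

span = soup.select_one("span")
# decode_contents() returns the inner HTML only, without the <span> tag itself.
inner_html = span.decode_contents()
# Re-parse the combined string so <strong> becomes a real tag, not escaped text.
fragment = BeautifulSoup(f"message: {inner_html}, end of message", "html.parser")
span.replace_with(fragment)

result = str(soup)
print(result)
```

This should print:

<div>message: <strong>Hello world</strong>, end of message</div>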