
Inserting a DOM element into HTML changes the character "<" to the HTML entity "&lt;" in Python using pandas


I want to edit an HTML file and make one column of a table editable. I am using pandas and BeautifulSoup in Python.

Code Snippet:

import pandas as pd
from bs4 import BeautifulSoup
from io import StringIO


with open("../templates/output/alternateSearchResponse.html") as fp:
    # Name the parser explicitly so the result does not depend on what is installed
    soup = BeautifulSoup(fp, "html.parser")
    # Parse the fifth <table> of the page into a DataFrame
    data_frame = pd.read_html(StringIO(str(soup.find_all('table')[4])))[0]

    # Wrap every quantity value in a text input
    allTextFields = data_frame.get('quantity').apply(lambda x: f'''<input type="text" value="{x}">''')

    data_frame.at[0, 'quantity'] = allTextFields.iloc[0]
    data_frame.at[1, 'quantity'] = allTextFields.iloc[1]

    htmlOutput = data_frame.to_html()
    print(htmlOutput)
    with open("../templates/output/test.html", "w") as f:
        f.write(htmlOutput)

When I open test.html in a text editor, I find that the <input> element has been written as

&lt;input type="text"&gt;0&lt;/input&gt; inside html.

Below is an image of the source HTML. The marked table in the image is the one I want to edit.

Source HTML Image

This is the HTML I get after running the script: Result HTML Image

This is the HTML I expect after running the script, with the quantity field editable: Expected HTML Image

The source HTML I am trying to edit doesn't have any CSS, JavaScript, or IDs on any fields. It contains only plain <table> elements.

Questions:

  1. How can I solve this problem?
  2. Is there any other workaround or library in Python that I can use to add or edit HTML DOM elements as and when required?

Just FYI, the source HTML is actually generated from JSON output using the json2html library in Python. My next step would be to add a submit button to the same HTML and perform a form submit.

The expectation is already mentioned above.


Solution

  • The documentation for DataFrame.to_html() describes the escape parameter:

    escape : bool, default True
    
        Convert the characters <, >, and & to HTML-safe sequences.
    

    So you need to turn escaping off:

    htmlOutput = data_frame.to_html(escape=False)
    

    Or, since to_html() also accepts an output path as its first argument, write the file directly:

    data_frame.to_html("../templates/output/test.html", escape=False)
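
To see the difference without the original file, here is a minimal sketch that compares both settings. The small DataFrame and its "item" column are made-up stand-ins for the table parsed from the source HTML; only the "quantity" column name comes from the question.

```python
import pandas as pd

# Hypothetical stand-in for the table parsed from the source HTML;
# the "item" column and its values are invented for illustration.
data_frame = pd.DataFrame({"item": ["bolt", "nut"], "quantity": [0, 5]})

# Replace the whole column with <input> markup built from each value.
data_frame["quantity"] = data_frame["quantity"].apply(
    lambda x: f'<input type="text" value="{x}">'
)

# Default behaviour: <, > and & in cell values become HTML entities.
escaped = data_frame.to_html()

# With escape=False the cell values are written to the output as-is.
unescaped = data_frame.to_html(escape=False)
```

With the default setting the markup shows up as literal text in the browser, while the unescaped version renders real text inputs. Keep in mind that escape=False applies to every cell, so it is only safe when the table contents come from a trusted source.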