python django html-parsing

Django: Parse HTML (containing form) to dictionary


I create an HTML form on the server side.

<form action="." method="POST">
 <input type="text" name="foo" value="bar">
 <textarea name="area">long text</textarea>
 <select name="your-choice">
  <option value="a" selected>A</option>
  <option value="b">B</option>
 </select>
</form>

Desired result:

{
 "foo": "bar",
 "area": "long text",
 "your-choice": "a",
}

The function I am looking for, parse_form(), could be used like this:

response = client.get('/foo/')

# response contains <form> ...</form>

data = parse_form(response.content)

data['my-input'] = 'bar'

response = client.post('/foo/', data)

How to implement parse_form() in Python?

This is not specific to Django. Nevertheless, a feature request for it was filed against Django, but it was rejected several years ago: https://code.djangoproject.com/ticket/11797

Update

I wrote a small Python library around the lxml-based answer: html_form_to_dict


Solution

  • Why not just this?

    import lxml.html

    def parse_form(content):
        # Parse the document and return the first form's fields as a plain dict.
        tree = lxml.html.fromstring(content)
        return dict(tree.forms[0].fields)
    

    I could not see a reason for using a UserDict.

    One small caveat: when the form contains a <select> in which no option is marked as selected, lxml returns the value of the first option, whereas the BeautifulSoup-based solution I gave above returns None.
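
If adding lxml as a dependency is not an option, roughly the same result can be had from the standard library alone. The following is only a sketch built on html.parser, not the lxml approach above: the helper class _FormParser and the encoding parameter are my own names, it reads only the first form, it assumes every <option> carries a value attribute, and it keeps a single value per field name. For a <select> with nothing selected it falls back to the first option, which matches the lxml behaviour described in the caveat.

```python
from html.parser import HTMLParser


class _FormParser(HTMLParser):
    """Hypothetical helper: collects name/value pairs from the first <form>."""

    # Input types that carry no value worth re-posting in this sketch.
    SKIPPED_TYPES = ("submit", "button", "reset", "image", "file")

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.fields = {}
        self._in_form = False
        self._done = False       # set once the first form has been closed
        self._textarea = None    # name of the <textarea> currently open
        self._select = None      # name of the <select> currently open

    def handle_starttag(self, tag, attrs):
        if self._done:
            return
        attrs = dict(attrs)
        if tag == "form":
            self._in_form = True
        elif not self._in_form:
            return
        elif tag == "input" and attrs.get("name"):
            kind = (attrs.get("type") or "text").lower()
            if kind in self.SKIPPED_TYPES:
                return
            if kind in ("checkbox", "radio") and "checked" not in attrs:
                return
            self.fields[attrs["name"]] = attrs.get("value") or ""
        elif tag == "textarea" and attrs.get("name"):
            self._textarea = attrs["name"]
            self.fields[self._textarea] = ""
        elif tag == "select" and attrs.get("name"):
            self._select = attrs["name"]
            self.fields[self._select] = None
        elif tag == "option" and self._select:
            # Assumption: options have an explicit value attribute.
            value = attrs.get("value") or ""
            # A selected option wins; otherwise the first option is the default.
            if "selected" in attrs or self.fields[self._select] is None:
                self.fields[self._select] = value

    def handle_data(self, data):
        # Only text inside a <textarea> contributes to a field value.
        if self._textarea is not None:
            self.fields[self._textarea] += data

    def handle_endtag(self, tag):
        if tag == "textarea":
            self._textarea = None
        elif tag == "select":
            self._select = None
        elif tag == "form" and self._in_form:
            self._in_form = False
            self._done = True


def parse_form(content, encoding="utf-8"):
    """Return the fields of the first form in `content` as a dict."""
    if isinstance(content, bytes):
        # Assumption: the response body is UTF-8 unless told otherwise.
        content = content.decode(encoding)
    parser = _FormParser()
    parser.feed(content)
    parser.close()
    return parser.fields
```

Since Django's response.content is bytes, the function accepts both bytes and str, so it can be dropped into the client.get() / client.post() round trip from the question unchanged.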