
Pyramid: Removing HTML Tags From Form Input


I have a Pyramid application with a <form> in my template.

I want to remove all HTML tags that a user submits through an <input> field.

How can I do that? I need a secure, server-side approach, since JavaScript validation runs on the client and can be bypassed.

A simple example would be appreciated, if possible.

Here is my example:

<form method="GET" action="submit">
    <input type="text" name="username"/>
</form>

And in my Pyramid view I have:

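from pyramid.view import view_config

@view_config(name='submit', renderer='templates/submit.jinja2')
def submit(request):
    # request.params merges query-string (GET) and form-body (POST) values
    var = request.params['username']
    return {'input': var}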

Here, if the user enters <a href="google.com"> John Doe </a>, var will also be <a href="google.com"> John Doe </a>. However, I only need John Doe.


Solution

  • My default answer is that you probably shouldn't remove those tags at all, but instead encode them properly when they are written to the client. If someone wants their username to be Lyndsy <b>Simon</b>, that should be fine. Escaping on output means you don't have to sanitize every input, and you avoid a code injection vector if one input path is left unsanitized or an attacker finds a way around the input filter.

    That said, if removing the HTML tags on input is definitely the path you want to take, I have used bleach in the past and recommend it. My use case has been when I needed to accept some HTML tags (bold, italic, etc.), but strip others. Bleach allows you to set a whitelist of allowable tags to fit this need.

    Note that you can still use bleach to strip the tags on output, instead of input, if that's the way you decide to go.
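To make the two options above concrete, here is a minimal sketch using only the Python standard library, so it runs without extra dependencies. The names `TagStripper`, `strip_tags`, and `escape_for_output` are hypothetical helpers introduced for illustration; they are not part of Pyramid or bleach. The stripper is a simple stand-in for what bleach does, not a replacement for it. With bleach installed, the stripping step would be roughly `bleach.clean(value, tags=set(), strip=True)`, though that call is not exercised here.

```python
from html import escape
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Hypothetical helper: keeps text content and discards every tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []

    def handle_data(self, data):
        # Called for text between tags; tags themselves are never stored.
        self._chunks.append(data)

    def get_text(self):
        return "".join(self._chunks)


def strip_tags(value):
    """Option 2: remove tags from the submitted value (input-side cleanup)."""
    stripper = TagStripper()
    stripper.feed(value)
    stripper.close()  # flush any text still buffered by the parser
    return stripper.get_text().strip()


def escape_for_output(value):
    """Option 1: keep the value as submitted and escape it when rendering."""
    return escape(value, quote=True)


raw = '<a href="google.com"> John Doe </a>'
print(strip_tags(raw))         # John Doe
print(escape_for_output(raw))  # &lt;a href=&quot;google.com&quot;&gt; John Doe &lt;/a&gt;
```

In the view from the question, this would amount to something like `var = strip_tags(request.params['username'])`. One caveat of this sketch: because character references are decoded, an input such as `&lt;b&gt;` comes back from `strip_tags` as the literal text `<b>`, so the stripped value still needs to be escaped when it is written to the page. That is another reason to treat output escaping as the primary defense.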