I'm using bleach to sanitize user input, but I also use Markdown, which means I need the blockquote > symbol to pass through without being escaped as &gt; so that I can hand the text to misaka for rendering.
The documentation says it escapes HTML markup by default, but it doesn't say how to turn that off for the > symbol. I would still like it to escape actual HTML tags.
http://bleach.readthedocs.org/en/latest/clean.html
Any other ideas for sanitizing input while maintaining the ability to use Markdown would be appreciated.
Do you need to strip all tags but leave > as it is?
Simple way for step 2 (restoring > after sanitizing):
output = output.replace('&gt;', '>')
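A more professional way:
import HTMLParser  # Python 2; on Python 3 use html.unescape() instead
h = HTMLParser.HTMLParser()
# Note: unescape() converts every entity back, including &lt; and &amp;, not only &gt;
s = h.unescape(sanitized_user_input)  # sanitized_user_input: the string returned by bleach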
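If you only want the blockquote markers back and everything else left escaped, a narrower variant of the replace idea might look like the sketch below. It assumes the text has already been sanitized (so > arrives as &gt;) and that a blockquote marker only matters at the start of a line, optionally indented by up to three spaces. The name restore_blockquotes is made up for this example and is not part of bleach or misaka.

```python
import re

# A run of escaped '>' markers at the start of a line: up to three spaces of
# indentation, then one or more '&gt;' each optionally followed by a space.
_BLOCKQUOTE_PREFIX = re.compile(r"^ {0,3}(?:&gt; ?)+", re.MULTILINE)


def restore_blockquotes(sanitized):
    """Turn leading '&gt;' markers back into '>' and leave other entities alone.

    Hypothetical helper: 'sanitized' is assumed to be the already-escaped text.
    """
    return _BLOCKQUOTE_PREFIX.sub(
        lambda m: m.group(0).replace("&gt;", ">"), sanitized
    )


if __name__ == "__main__":
    text = "&gt; quoted line\n1 &gt; 0 and &lt;b&gt;bold&lt;/b&gt;"
    print(restore_blockquotes(text))
```

Only the &gt; at the start of the first line is restored; the &gt; in the middle of the second line and the escaped tags stay escaped, so the result can still be passed on for Markdown rendering without re-enabling raw HTML.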