html ruby-on-rails xhtml tags wysiwyg

Forcing tags to close in dynamic text


In many places on many of my sites, users are permitted to enter formatted text through a WYSIWYG editor or as plain text with tags. Naturally, such input is sanitized for security threats, but it is not stripped of tags, nor is it fully entity encoded. Something like <p>hello world</p> is sent back to the end user unchanged, as <p>hello world</p>.

Most WYSIWYG editors are smart enough to clean up the markup before handing the content over to the form, but manual POST requests, non-WYSIWYG text areas, and users without JavaScript get no such cleanup. So there's nothing to stop a user from entering an unclosed <a href="/">, which turns the rest of the page into a link.

What's the best way to treat this?


Solution

  • Whatever the user supplies, parse it using an HTML parser. Sanitize it while it's a DOM, then serialize the DOM back to HTML, taking the contents of the body element (the parser will create one if necessary) as the string sent back to the end user. Every element that needs a closing tag will then have one in place.
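
The parse-then-reserialize idea can be sketched with nothing but Python's standard library. This is a simplified stand-in, not a full DOM round trip: it uses `html.parser.HTMLParser` to tokenize the fragment and keeps a stack of open elements so that anything left open gets closed on output. The names `TagBalancer`, `balance_tags`, and `VOID_ELEMENTS` are hypothetical, chosen for this illustration. In a Rails application, a real DOM-building HTML parser (Nokogiri, for example) would most likely play this role instead.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
                 "link", "meta", "source", "track", "wbr"}


class TagBalancer(HTMLParser):
    """Hypothetical helper: re-serializes a fragment, closing open elements.

    Comments, doctypes, and processing instructions are silently dropped,
    because the default HTMLParser handlers for them do nothing.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # serialized output pieces
        self.open_tags = []  # stack of currently open element names

    def handle_starttag(self, tag, attrs):
        # Attribute values arrive unescaped, so escape them again on output.
        rendered = "".join(
            f" {name}" if value is None
            else f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
        )
        self.out.append(f"<{tag}{rendered}>")
        if tag not in VOID_ELEMENTS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray closing tag with no matching opener: drop it
        # Close everything opened after the matching element, then the element.
        while self.open_tags:
            current = self.open_tags.pop()
            self.out.append(f"</{current}>")
            if current == tag:
                break

    def handle_data(self, data):
        # Text arrives with character references decoded; re-encode it.
        self.out.append(escape(data, quote=False))

    def close_all(self):
        # Emit closing tags for whatever is still open, innermost first.
        while self.open_tags:
            self.out.append(f"</{self.open_tags.pop()}>")


def balance_tags(fragment: str) -> str:
    """Return the fragment with every non-void element explicitly closed."""
    parser = TagBalancer()
    parser.feed(fragment)
    parser.close()      # flush any buffered text
    parser.close_all()  # close elements the user left open
    return "".join(parser.out)


if __name__ == "__main__":
    print(balance_tags('<a href="/">home'))
    print(balance_tags("<p><b>hi</p>"))
```

With this sketch, the unclosed link from the question comes back with its closing tag, so it can no longer swallow the rest of the page. It assumes the input has already been sanitized (or will be sanitized at the same stage); it balances tags only and does not enforce HTML content-model rules the way a full HTML5 parser would.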