
Rails HTML sanitizer (Loofah) adds unwanted line breaks between elements


Rails's built-in HTML sanitizer (which uses the Loofah gem) inserts newlines between <ul> and <li> tags. I want to display the sanitized content with white-space: pre-wrap; because it comes from a WYSIWYG editor, but the extra newlines make the output look wrong. The image shows the desired output on top and the actual output on the bottom, with a background color added to the ul for emphasis:

[Image: desired versus actual output of HTML sanitizing]

Here's what happens when I run some HTML through the sanitizer in the Rails console:

2.2.2 :033 > input = "<ul><li>a</li><li>b</li></ul>"
 => "<ul><li>a</li><li>b</li></ul>"
2.2.2 :034 > WhiteListSanitizer.new.sanitize(input)
 => "<ul>\n<li>a</li>\n<li>b</li>\n</ul>"

And if I make a Loofah object and convert it to HTML without scrubbing, it still adds newlines:

2.2.2 :035 > Loofah.fragment(input).to_html
 => "<ul>\n<li>a</li>\n<li>b</li>\n</ul>"

How do I make it leave the whitespace alone?

I can strip out the line breaks with a regex if absolutely necessary, but it seems strange that there's no option to disable this behavior.


Solution

  • Exclude the FORMAT flag from the DEFAULT_HTML save options (both are constants in Nokogiri::XML::Node::SaveOptions) and pass the result as save_with:

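    input = "<ul><li>a</li><li>b</li></ul>"
    # DEFAULT_HTML includes the FORMAT flag, which is what inserts the newlines.
    # XOR-ing with FORMAT switches that one flag off and leaves the others intact.
    disable_formatting = Nokogiri::XML::Node::SaveOptions::DEFAULT_HTML ^ Nokogiri::XML::Node::SaveOptions::FORMAT
    Loofah.fragment(input).to_html(save_with: disable_formatting)
    # => "<ul><li>a</li><li>b</li></ul>"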
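The save options are a plain integer bitmask, so the XOR above works only because FORMAT is already set in DEFAULT_HTML. The sketch below illustrates that with stand-in flag values (they are assumptions for illustration, not read from Nokogiri) and compares XOR with the AND-NOT form, which clears the flag whether or not it was set:

```ruby
# Stand-in flag values for illustration only; in real code use the
# constants from Nokogiri::XML::Node::SaveOptions instead.
FORMAT         = 1 << 0
NO_DECLARATION = 1 << 1
NO_EMPTY_TAGS  = 1 << 2
AS_HTML        = 1 << 6
DEFAULT_HTML   = FORMAT | NO_DECLARATION | NO_EMPTY_TAGS | AS_HTML

# XOR toggles the bit: it clears FORMAT here only because FORMAT is set.
xor_cleared = DEFAULT_HTML ^ FORMAT

# AND-NOT clears the bit unconditionally.
and_cleared = DEFAULT_HTML & ~FORMAT

# Starting from a mask that lacks FORMAT, XOR would switch it on instead.
xor_on_unset = AS_HTML ^ FORMAT
and_on_unset = AS_HTML & ~FORMAT

puts(xor_cleared == and_cleared)       # both forms agree when FORMAT was set
puts((xor_on_unset & FORMAT) != 0)     # XOR turned FORMAT on
puts((and_on_unset & FORMAT) != 0)     # AND-NOT left FORMAT off
```

If the starting mask might not contain FORMAT, writing the expression as DEFAULT_HTML & ~FORMAT is the safer spelling; for DEFAULT_HTML as used in the answer, both forms give the same value.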