ruby xml xhtml nokogiri conditional-comments

Tweak IE Conditional Comment with Nokogiri without converting entities


I have an XHTML file with an HTML5 Shiv in the head:

<!--[if lt IE 9]>
  <script src='../common/html5.js' type='text/javascript'></script>
<![endif]-->

Using Nokogiri, I need to adjust the path in that comment, stripping off the ../. However, any change to the .content of the comment node results in XML output that converts the > and < to entities:

XML = <<ENDXML
<r><!--[if lt IE 9]>
  <script src='../common/html5.js' type='text/javascript'></script>
<![endif]--></r>
ENDXML

require 'nokogiri'
doc = Nokogiri.XML(XML)
comment = doc.at_xpath('//comment()')
comment.content = comment.content.sub('../','')
puts comment.to_xml
#=> <!--[if lt IE 9]&gt;
#=>   &lt;script src='common/html5.js' type='text/javascript'&gt;&lt;/script&gt;
#=> &lt;![endif]-->

The original source is valid XML/XHTML. How can I get Nokogiri to leave the < and > inside this comment unescaped when I tweak it?


Solution

  • The Nokogiri docs for content= say:

    The string gets XML escaped, not interpreted as markup.

    So rather than assigning to content, you could swap the node for a new one: build a comment node explicitly with Nokogiri::XML::Comment.new and pass it to replace:

    XML = <<ENDXML
    <r><!--[if lt IE 9]>
      <script src='../common/html5.js' type='text/javascript'></script>
    <![endif]--></r>
    ENDXML
    
    require 'nokogiri'
    doc = Nokogiri.XML(XML)
    comment = doc.at_xpath('//comment()')
    
    # this line is the new one, replacing comment.content= ...
    comment.replace Nokogiri::XML::Comment.new(doc, comment.content.sub('../',''))
    
    # note `comment` is the old comment, so to see the changes
    # look at the whole document
    puts doc.to_xml
    

    Output is:

    <?xml version="1.0"?>
    <r>
      <!--[if lt IE 9]>
      <script src='common/html5.js' type='text/javascript'></script>
    <![endif]-->
    </r>