ruby, xml, nokogiri, cdata

How can I put a string with an ampersand in an XML file with Nokogiri?


I'm trying to include a URL to an image in an XML file, and the ampersands in the URL query string are getting stripped out:

bgdoc.xpath('//Master').each do |elem|
  part = elem.xpath('Part').inner_text
  image = imagehash[part]
  image = "" if image.blank?
  elem.xpath('Image').first.content = "<![CDATA[#{image}]]>"
  puts elem.xpath('Image').first.content
end

bgdoc is written out later with the help of Builder, but it is inserted all at once rather than element by element. That makes this a different case from a similar question posted on SO.


Solution

  • Use create_cdata to create a CDATA node, then add_child to add it to the document. Just assigning a string to content will leave you with an escaped &lt;![CDATA[... in your XML, and that's not very useful.

    A short example should illustrate the process:

    xml   = '<Master><Image></Image><Image></Image></Master>'
    bgdoc = Nokogiri::XML(xml)
    cdata = bgdoc.create_cdata('/where?is=pan&cakes=house')
    bgdoc.xpath('//Image').first.add_child(cdata)
    

    Then, if you call bgdoc.to_xml, you'll get something like this:

    <?xml version="1.0"?>
    <Master>
        <Image><![CDATA[/where?is=pan&cakes=house]]></Image>
        <Image/>
    </Master>
    

    I think that's the sort of result you're looking for. However, if you just assign a string to content:

    bgdoc.xpath('//Image').first.content = '<![CDATA[/where?is=pan&cakes=house]]>'
    

    Then you get this XML:

    <?xml version="1.0"?>
    <Master>
        <Image>&lt;![CDATA[/where?is=pan&amp;cakes=house]]&gt;</Image>
        <Image/>
    </Master>
    

    and that doesn't even have a CDATA node: content= treats the whole string as plain text, so the markup characters get escaped.