Ruby gem xml-simple: same output for different input


I'm struggling with the xml-simple (1.1.5) gem. This is my input in test.xml:

<bib>
<title><br/>X</title>
<title>X<br/>X</title>
<title>X<br/></title>
</bib>

Now see what happens using irb as follows:

$ irb -rxmlsimple -rpp
>> pp XmlSimple.xml_in("test.xml")
{"title"=>
  [{"br"=>[{}], "content"=>"X"},
   {"br"=>[{}], "content"=>["X", "X"]},
   {"br"=>[{}], "content"=>"X"}]}
=> {"title"=>[{"br"=>[{}], "content"=>"X"}, {"br"=>[{}], "content"=>["X", "X"]}, {"br"=
>>

So apparently the first and last <title> elements, though different (<br/> before the text vs. after it), produce identical hashes in the output.
Is this a bug?


Solution

  • The xml-simple gem does not work reliably with mixed content. Here's an extract from its documentation:

    Mixed content (elements which contain both text content and nested elements) will not be represented in a useful way - element order and significant whitespace will be lost. If you need to work with mixed content, then XmlSimple is not the right tool for your job.
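
If the order of text and <br/> inside each <title> matters, one option is to walk the child nodes yourself with REXML, the parser that xml-simple uses under the hood. The following is only a minimal sketch, not part of the quoted documentation: the XML is inlined from the question, and the variable names and the choice to represent elements as symbols are illustrative assumptions.

```ruby
require 'rexml/document'

# Same input as test.xml in the question, inlined here for convenience.
xml = <<~XML
  <bib>
  <title><br/>X</title>
  <title>X<br/>X</title>
  <title>X<br/></title>
  </bib>
XML

doc = REXML::Document.new(xml)

# For each <title>, collect its child nodes in document order.
# Elements are represented by their name as a symbol (an arbitrary choice
# for this sketch); text nodes are represented by their string value.
titles = doc.get_elements('/bib/title').map do |title|
  title.children.map do |node|
    node.is_a?(REXML::Element) ? node.name.to_sym : node.value
  end
end

titles.each { |t| p t }
```

With this approach the three titles come out as [:br, "X"], ["X", :br, "X"] and ["X", :br], so the first and last are no longer indistinguishable.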