How to merge two xml files using xmlstarlet


I have two example xml files and I want to merge the elements inside them.

If I run xmlstarlet sel -t -c "//data" input1.xml input2.xml, I get the data tag twice (I know this is the correct result), but I would like to merge only the item tags under a single data tag.

These are my input files:

input1.xml:

<?xml version="1.0" encoding="utf-8" standalone="no"?>
<demo:pub xsi:schemaLocation="demo_1_0 demo.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:demo="demo_1_0">
  <data>
    <item>
      <fieldA>12</fieldA>
      <fieldB>Hello world</fieldB>
    </item>
    <item>
      <fieldA>15</fieldA>
      <fieldB>The book is yellow</fieldB>
    </item>
  </data>
</demo:pub>

input2.xml:

<?xml version="1.0" encoding="utf-8" standalone="no"?>
<demo:pub xsi:schemaLocation="demo_1_0 demo.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:demo="demo_1_0">
  <data>
    <item>
      <fieldA>08</fieldA>
      <fieldB>Hello world II</fieldB>
    </item>
    <item>
      <fieldA>06</fieldA>
      <fieldB>The book is orange</fieldB>
    </item>
  </data>
</demo:pub>

And I would like to have something like:

<data xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:demo="demo_1_0">
    <item>
      <fieldA>12</fieldA>
      <fieldB>Hello world</fieldB>
    </item>
    <item>
      <fieldA>15</fieldA>
      <fieldB>The book is yellow</fieldB>
    </item>
    <item>
      <fieldA>08</fieldA>
      <fieldB>Hello world II</fieldB>
    </item>
    <item>
      <fieldA>06</fieldA>
      <fieldB>The book is orange</fieldB>
    </item>
</data>

Solution

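  • Combine two XML files with xmlstarlet and bash. The item elements copied from each file are added to an empty data element as escaped text, xmlstarlet unescape turns that text back into markup, and format --nsclean drops the namespace declarations repeated on each item: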

    echo "<data/>" | xmlstarlet edit \
      --insert '//data' --type attr -n 'xmlns:xsi' --value 'http://www.w3.org/2001/XMLSchema-instance' \
      --insert '//data' --type attr -n 'xmlns:demo' --value 'demo_1_0' \
      --subnode '//data' --type text -n '' --value "$(xmlstarlet select --omit-decl -t --copy-of '//data/item' input1.xml)" \
      --subnode '//data' --type text -n '' --value "$(xmlstarlet select --omit-decl -t --copy-of '//data/item' input2.xml)" \
      | xmlstarlet unescape \
      | xmlstarlet format --omit-decl --nsclean
    

    Output:

    <data xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:demo="demo_1_0">
      <item>
        <fieldA>12</fieldA>
        <fieldB>Hello world</fieldB>
      </item>
      <item>
        <fieldA>15</fieldA>
        <fieldB>The book is yellow</fieldB>
      </item>
      <item>
        <fieldA>08</fieldA>
        <fieldB>Hello world II</fieldB>
      </item>
      <item>
        <fieldA>06</fieldA>
        <fieldB>The book is orange</fieldB>
      </item>
    </data>
    

    See: xmlstarlet --help, xmlstarlet edit --help, xmlstarlet format --help
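
A shorter variant of the same idea, offered as a sketch rather than as part of the answer above: since sel already accepts several input files, the copied item elements can be wrapped in a hand-written data element and passed through the formatter. The function name merge_items is hypothetical, and the two namespace declarations are hard-coded on the assumption that every input uses the ones shown in the question.

```shell
# merge_items is a hypothetical helper name, not an xmlstarlet feature.
# Assumption: all inputs declare the same xsi and demo namespaces as the
# sample files in the question, so they are hard-coded on the wrapper.
merge_items() {
  {
    # Open the single wrapper element by hand.
    printf '%s\n' '<data xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:demo="demo_1_0">'
    # Copy every item from every file given as an argument.
    xmlstarlet sel -t -c '//data/item' "$@"
    # Close the wrapper element.
    printf '\n%s\n' '</data>'
  } | xmlstarlet fo --omit-decl --nsclean   # re-indent, drop repeated xmlns attributes
}

# Usage: merge_items input1.xml input2.xml
```

Calling merge_items input1.xml input2.xml should give one data element containing the four item elements. This avoids the escape and unescape round trip, at the cost of building the wrapper element as plain text.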