
Parsing an XML file and adding a new key based on an existing key


I'm really looking for some advice on the best approach to tackling this in bash.

I have an XML file with thousands of entries that looks like this:

<?xml version="1.0"?>
<myList>
    <dataitem>
        <path>./5553 Subset 43d.zip</path>
        <name>5553 Subset 43d</name>
    </dataitem>
    <dataitem>
        <path>./Another file name here with spaces.zip</path>
        <name>Another file name here with spaces</name>
    </dataitem>
...

I'd like to add a new key to each <dataitem>, built from the text of its <name> key plus an .mp4 extension, so that the file looks like this:

<?xml version="1.0"?>
<myList>
    <dataitem>
        <path>./5553 Subset 43d.zip</path>
        <name>5553 Subset 43d</name>
        <video>5553 Subset 43d.mp4</video>
    </dataitem>
    <dataitem>
        <path>./Another file name here with spaces.zip</path>
        <name>Another file name here with spaces</name>
        <video>Another file name here with spaces.mp4</video>
    </dataitem>
...

Solution

  • The right way, using the xmlstarlet tool:

    xmlstarlet ed -s "//dataitem" -t elem -n video input.xml \
    | xmlstarlet ed -u "//dataitem/video" -x "concat(./preceding-sibling::name/text(), '.mp4')"
    

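    The first command appends an empty <video> element to every <dataitem>; the second sets its text to the value of the sibling <name> followed by ".mp4". The output should be: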

    <?xml version="1.0"?>
    <myList>
      <dataitem>
        <path>./5553 Subset 43d.zip</path>
        <name>5553 Subset 43d</name>
        <video>5553 Subset 43d.mp4</video>
      </dataitem>
      <dataitem>
        <path>./Another file name here with spaces.zip</path>
        <name>Another file name here with spaces</name>
        <video>Another file name here with spaces.mp4</video>
      </dataitem>
    ...
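
The two piped commands above can probably be folded into a single `ed` invocation, since `ed` applies its operations in the order given. The following is a minimal sketch under a few assumptions, not the answer's own method: the function name `add_video` and its two file arguments are hypothetical, and `../name` is used in place of `preceding-sibling::name`, which assumes every <dataitem> has exactly one <name> child.

```shell
#!/usr/bin/env bash
# add_video IN OUT (hypothetical helper, for illustration only):
# append a <video> element to every <dataitem> in IN and write the result to OUT.
add_video() {
    local in=$1 out=$2
    # Step 1 (-s): add an empty <video> child to each <dataitem>.
    # Step 2 (-u ... -x): set each <video> to its parent's <name> text plus ".mp4".
    xmlstarlet ed \
        -s "//dataitem" -t elem -n video \
        -u "//dataitem/video" -x "concat(../name, '.mp4')" \
        "$in" > "$out"
}

# Usage: add_video input.xml output.xml
```

Writing to a separate output file keeps the original intact. Note that running this twice on the same data would append a second <video> to each <dataitem>, so it is best applied once to the unmodified input.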