
xmlstarlet: how to select items to build a hash


I would like to extract (using xmlstarlet) from the following XML file (list.xml)

<?xml version="1.0" encoding="UTF-8"?>
<reports>
  <report>
    <name>b486f8d9</name>
    <readableName>Scan1</readableName>
    <timestamp>1375757990</timestamp>
  </report>
  <report>
    <name>5f01bd96</name>
    <readableName>Scan2</readableName>
    <timestamp>1367342696</timestamp>
  </report>
</reports>

the value of readableName for a given name. In the example above this would be Scan1 for a query on b486f8d9.

I found a great answer to a very similar problem, but its query targets a different kind of element. Based on it, I tried

xmlstarlet sel -t -c "/reports/report[name=b486f8d9]" list.xml

but this did not work (the output was empty).

Could you please help me construct the appropriate query for my case? Since I ultimately want to build a hash in bash (with name as the key and readableName and timestamp as the values), maybe there is a cleverer way to do that than parsing the file the way I intend to (first get the list of names, then query the values for each of them)?

Thanks!


Solution

  • The comparison:

    name=b486f8d9
    

    compares the value of the name element with the value of a child element called b486f8d9, because an unquoted word in XPath is read as an element name, not as a string. Since there is no element b486f8d9, the predicate never matches. What you wanted was to compare the name element with the string 'b486f8d9':

    xmlstarlet sel -t -c "/reports/report[name='b486f8d9']" list.xml
    

    But that's going to get you a chunk of XML, since -c prints a copy of the selected element. What you want is the string value (-v) of the readableName element:

    xmlstarlet sel -t -v "/reports/report[name='b486f8d9']/readableName" list.xml
    

    which will print

    Scan1
    

    So that's how you do a lookup. But I believe you want a full report of all of the names. You can produce pretty much any format you like; here's one example (note the use of -m to match every /reports/report element; -o outputs literal text and -n a newline):

    $ xmlstarlet sel -t -m "/reports/report" \
                     -v name -o ' ' -v readableName -o ':' -v timestamp -n list.xml
    b486f8d9 Scan1:1375757990
    5f01bd96 Scan2:1367342696
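
To get from that report to the hash the question asks about, one possible approach is to read each line into bash associative arrays. The following is only a sketch: the function name load_reports and the array names readable_name and timestamp are made up for illustration, it needs bash 4 or newer, and it assumes that name contains no whitespace. To keep it runnable without xmlstarlet, it is fed the two output lines shown above; the commented-out call shows how the real command could be plugged in.

```shell
#!/usr/bin/env bash
# Sketch only: associative arrays require bash 4 or newer.
declare -A readable_name timestamp

# Read "name readableName:timestamp" lines from stdin into the two arrays.
# Assumption: name has no whitespace; the timestamp is whatever follows the last ':'.
load_reports() {
  local key rest
  while read -r key rest; do
    [ -n "$key" ] || continue          # skip empty lines
    readable_name["$key"]=${rest%:*}   # everything before the last ':'
    timestamp["$key"]=${rest##*:}      # everything after the last ':'
  done
}

# Sample input copied from the output above. With xmlstarlet installed, use instead:
#   load_reports < <(xmlstarlet sel -t -m "/reports/report" \
#       -v name -o ' ' -v readableName -o ':' -v timestamp -n list.xml)
load_reports <<'EOF'
b486f8d9 Scan1:1375757990
5f01bd96 Scan2:1367342696
EOF

# Look up both values for a given name.
echo "${readable_name[b486f8d9]} ${timestamp[b486f8d9]}"
```

This parses the file once instead of running one query per name. The input is redirected into the function rather than piped into it, because a pipeline would run the loop in a subshell and the arrays would be empty afterwards.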