bash+xmlstarlet: How can one index into a list, or populate an array?


I'm trying to select a single node using xmlstarlet from the following example XML:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="key.xsl" ?>
<tables>
  <tableset>
    <table name="table1">
      <row>
        <fld name="fileName">
          <strval><![CDATA[/my/XYZ/file1]]></strval>
        </fld>
        <fld name="fileName">
          <strval><![CDATA[/my/XYZ/file2]]></strval>
        </fld>
        <fld name="fileName">
          <strval><![CDATA[/my/other/XYZ/file3]]></strval>
        </fld>
        <fld name="worksBecauseUnique">
          <strval><![CDATA[/XYZ/unique]]></strval>
        </fld>
      </row>
    </table>
  </tableset>
</tables>

I'm trying to build an associative array in bash... How can I select a single node, or iterate over multiple nodes using xmlstarlet?

So far I've tried something like the following, which doesn't work:

xmlstarlet sel -t -v "//tables/tableset/table/row/fld[@name=\"fileName\"]/strval[0]" xmlfile.xml

I was hoping to get "/my/XYZ/file1", but this doesn't work.


Solution

  • Answering the first part of your question, there's a simple mistake you're making:

    strval[0]
    

    needs to be

    strval[1]
    

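    ...to select the first instance, because XPath positions are 1-based, not 0-based. Note that strval[1] means "the first strval child of each matching fld", so against your sample it still matches one node per fileName field.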


    To select the Nth match across the whole document, rather than within each parent fld, wrap the path in parentheses before indexing. For the second match:

    (//tables/tableset/table/row/fld[@name="fileName"]/strval)[2]
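    The difference between the two indexing forms can be checked against the sample document. This is a minimal sketch, assuming xmlstarlet is installed; the temporary file from mktemp and the variable names are illustrative, and the XML is a trimmed copy of the one in the question.

```shell
#!/usr/bin/env bash
# Sketch only: write a trimmed copy of the question's XML to a temp file
# (hypothetical path from mktemp) and compare the two indexing forms.
xmlfile=$(mktemp)
cat >"$xmlfile" <<'EOF'
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<tables>
  <tableset>
    <table name="table1">
      <row>
        <fld name="fileName"><strval><![CDATA[/my/XYZ/file1]]></strval></fld>
        <fld name="fileName"><strval><![CDATA[/my/XYZ/file2]]></strval></fld>
        <fld name="fileName"><strval><![CDATA[/my/other/XYZ/file3]]></strval></fld>
        <fld name="worksBecauseUnique"><strval><![CDATA[/XYZ/unique]]></strval></fld>
      </row>
    </table>
  </tableset>
</tables>
EOF

base='//tables/tableset/table/row/fld[@name="fileName"]/strval'

# [1] binds to the last step: "first strval child of each fld",
# so every fileName field contributes one node.
per_parent_count=$(xmlstarlet sel -t -v "count(${base}[1])" "$xmlfile")

# Parentheses build the whole node-set first, then index into it.
first=$(xmlstarlet sel -t -v "(${base})[1]" "$xmlfile")
second=$(xmlstarlet sel -t -v "(${base})[2]" "$xmlfile")

printf 'strval[1] matches %s nodes\n' "$per_parent_count"
printf 'first=%s second=%s\n' "$first" "$second"
rm -f "$xmlfile"
```

    Single-quoting the XPath in the shell variable avoids having to backslash-escape the inner double quotes, as the original command did.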
    

    Now on to populating a shell array. Since your content here doesn't contain newlines, you can read one entry per line:

    query='//tables/tableset/table/row/fld[@name="fileName"]/strval'
    
    fileNames=( )
    while IFS= read -r entry; do
      fileNames+=( "$entry" )
    done < <(xmlstarlet sel -t -v "$query" -n xmlfile.xml)
    
    # print results
    printf 'Extracted filename: %q\n' "${fileNames[@]}"
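    If bash 4.0 or later is available, the same read loop can probably be collapsed into one mapfile call. This is a sketch under that assumption, not a replacement for the loop: the temp file and trimmed XML are illustrative, and it uses -m with -v . -n so that each match is printed on its own line.

```shell
#!/usr/bin/env bash
# Sketch, assuming bash 4.0+ (mapfile does not exist in bash 3.x)
# and an installed xmlstarlet. The temp file path is hypothetical.
xmlfile=$(mktemp)
cat >"$xmlfile" <<'EOF'
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<tables>
  <tableset>
    <table name="table1">
      <row>
        <fld name="fileName"><strval><![CDATA[/my/XYZ/file1]]></strval></fld>
        <fld name="fileName"><strval><![CDATA[/my/XYZ/file2]]></strval></fld>
        <fld name="fileName"><strval><![CDATA[/my/other/XYZ/file3]]></strval></fld>
        <fld name="worksBecauseUnique"><strval><![CDATA[/XYZ/unique]]></strval></fld>
      </row>
    </table>
  </tableset>
</tables>
EOF

query='//tables/tableset/table/row/fld[@name="fileName"]/strval'

# -m iterates over every match; -v . -n prints its text plus a newline.
# mapfile -t stores each line as one array element, minus the newline.
mapfile -t fileNames < <(xmlstarlet sel -t -m "$query" -v . -n "$xmlfile")

printf 'Got %d entries\n' "${#fileNames[@]}"
printf 'Extracted filename: %q\n' "${fileNames[@]}"
rm -f "$xmlfile"
```

    As with the loop, this relies on the values containing no newlines.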
    

    You aren't giving enough detail to set up an associative array (how do you want to establish the keys?), so I'm doing this as a simple indexed one.


    On the other hand, if we make some assumptions (that you want the associative array to map each @name to its strval value, and that multiple values for the same key should be separated by newlines), then it might look like this:

    match='//tables/tableset/table/row/fld[@name][strval]'
    key_query='./@name'
    value_query='./strval'
    
    declare -A content=( )
    while IFS= read -r key && IFS= read -r value; do
      if [[ ${content[$key]} ]]; then
        # appending to existing value
        content[$key]+=$'\n'"$value"
      else
        # first value for this key
        content[$key]="$value"
      fi
    done < <(xmlstarlet sel \
               -t -m "$match" \
               -v "$key_query" -n \
               -v "$value_query" -n xmlfile.xml)
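    Reading the values back out might look like the following. This is a sketch assuming bash 4.0 or later; to keep it self-contained, the array is filled by hand with what the loop would be expected to store for the sample XML, and the names values and second_file are illustrative.

```shell
#!/usr/bin/env bash
# Sketch, assuming bash 4.0+ for associative arrays and mapfile.
# Filled by hand with the entries the loop should produce for the sample.
declare -A content=(
  [fileName]=$'/my/XYZ/file1\n/my/XYZ/file2\n/my/other/XYZ/file3'
  [worksBecauseUnique]='/XYZ/unique'
)

# Iterate over the keys; associative arrays have no guaranteed order.
for key in "${!content[@]}"; do
  # Split the newline-joined value back into one element per line.
  mapfile -t values <<<"${content[$key]}"
  printf '%s has %d value(s)\n' "$key" "${#values[@]}"
  printf '  %s\n' "${values[@]}"
done

# Direct lookup: the second value stored under the fileName key.
mapfile -t fileNames <<<"${content[fileName]}"
second_file=${fileNames[1]}
printf 'second fileName: %s\n' "$second_file"
```

    Joining with newlines only works because none of the values contain one. If they could, a separate indexed array per key would be safer.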