Bash rename files based on XML


I am writing a bash script that needs to rename all files of a particular type (in this case SVG) so that their names reflect the order in which they are listed in an XML file.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?><manifest identifier="id1" version="2006-01" smartnotebook:filesource="SMART Notebook for Mac Version=11.3.804.0" xmlns="http://www.imsglobal.org/xsd/imscp_v1p1" xmlns:adlcp="http://www.adlnet.org/xsd/adlcp_v1p3" xmlns:smartnotebook="http://www.smarttech.com/2006-01/notebook" xmlns:smartgallery="http://www.smarttech.com/2006-01/gallery"><metadata><schema>ADL SCORM</schema><schemaversion>CAM 1.3</schemaversion><adlcp:location>metadata.xml</adlcp:location></metadata><organizations><organization id="pagegroups"><item id="group0" identifierref="group0_pages"><title>Group 1</title></item></organization></organizations><resources><resource identifier="group0_pages" href="page4.svg" type="webcontent" adlcp:scormType="asset">*******<file href="page4.svg"/><file href="page0.svg"/><file href="page1.svg"/><file href="page3.svg"/><file href="page2.svg"/>********</resource><resource identifier="pages" href="page4.svg" type="webcontent" adlcp:scormType="asset"><file href="page4.svg"/><file href="page0.svg"/><file href="page1.svg"/><file href="page3.svg"/><file href="page2.svg"/></resource><resource identifier="images"/><resource identifier="sounds"/><resource identifier="attachments"/><resource identifier="flash"/><resource identifier="videos"/><resource identifier="annotationmetadata"/><resource identifier="brush"/></resources></manifest>

I need the script to read this file (and other files in the same format) and, for example, rename page4.svg to file0.svg, page0.svg to file1.svg, and so on. I have been investigating how to do this with xmllint, but my XPath knowledge is very limited. Anything is helpful! Thanks!


Solution

  • The XPath query //resource/file/@href gives you the list of hrefs, but with this sample it returns duplicates, because two resource elements list the same files. If you only want the list from the first resource, use //resource[1]/file/@href, or select it by identifier with //resource[@identifier="group0_pages"]/file/@href.

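    However, I was unable to get xmllint to accept those queries, even though they work in other tools. A likely reason is the default namespace (xmlns="http://www.imsglobal.org/xsd/imscp_v1p1") declared on the manifest element: in XPath 1.0 an unprefixed name such as resource only matches elements in no namespace, and xmllint applies that rule strictly. Instead, on Linux, I use the xpath utility, a Perl script from the libxml-xpath-perl package. Combined with some simple parsing, an example bash script looks like this: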

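    i=0
    # Read one href per line; -r keeps backslashes literal.
    while IFS= read -r filename; do
            # Quote the names so paths containing spaces survive.
            mv -- "$filename" "file$i.svg"
            i=$((i + 1))
    done < <(xpath -q -e '//resource[@identifier="group0_pages"]/file/@href' input.xml | cut -d\" -f2)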
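
If you would rather stay with xmllint, as the question asks, the same selection can probably be expressed by matching element names with local-name(), which ignores the namespace. What follows is a minimal sketch under a few assumptions, not a definitive implementation: the helper name rename_from_manifest, its optional prefix argument, and the grep/cut post-processing are all made up for illustration, and it assumes an xmllint build that supports the --xpath option.

```shell
#!/usr/bin/env bash
# Sketch only: rename_from_manifest is a hypothetical helper, not part of xmllint.
# Usage: rename_from_manifest MANIFEST RESOURCE_ID [PREFIX]
rename_from_manifest() {
    local manifest="$1"
    local resource_id="$2"
    local prefix="${3:-file}"   # assumed default, giving file0.svg, file1.svg, ...
    local i=0 filename

    # local-name() compares element names without their namespace, so the
    # default xmlns declared on <manifest> no longer hides the elements.
    local query="//*[local-name()='resource'][@identifier='${resource_id}']/*[local-name()='file']/@href"

    # Depending on the libxml2 version, xmllint prints the matched attributes
    # on one line or one per line; grep -o normalizes both to one href per line.
    xmllint --xpath "$query" "$manifest" |
        grep -o 'href="[^"]*"' |
        cut -d'"' -f2 |
        while IFS= read -r filename; do
            # Keep each file's own extension; paths are relative to the current directory.
            mv -- "$filename" "${prefix}${i}.${filename##*.}"
            i=$((i + 1))
        done
}
```

Run it from the directory that holds the SVG files, for example: rename_from_manifest input.xml group0_pages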