xml xslt xinclude

Create list of image paths in XML documents


I have an XML document that uses XInclude to pull in other XML files. All of these files contain relative image paths, and the images live in different source locations.

<chapter xml:id="chapter1" xmlns:xi="http://www.w3.org/2001/XInclude">
    <title>First chapter in Main Document</title>
    <section xml:id="section1">
        <title>Section 1 in Main Document</title>
        <para>this is paragraph<figure>
                <title>Car images</title>
                <mediaobject>
                    <imageobject>
                        <imagedata fileref="images/image1.jpg"/>
                    </imageobject>
                </mediaobject>
            </figure></para>
    </section>
    <xi:include href="../doc/section2.xml"/>
    <xi:include href="../doc/section3.xml"/>
</chapter>

The section2 and section3 documents look like this:

<section xml:id="section2">
    <title>Main Documentation Section2</title>
    <para>This is also paragraph <figure>
            <title>Different Images</title>
            <mediaobject>
                <imageobject>
                    <imagedata fileref="images/image2.jpg"/>
                </imageobject>
            </mediaobject>
        </figure></para>
</section>

I want to create an XSLT 1.0 stylesheet that generates a list of the image paths in all of these XML documents. The images are in different source locations, and I want to copy them into a single image folder, using the list to drive the copy. It would be great if the list were saved in a structure that a Java class can read.

Currently I am using an XSLT that I got from another question, but it outputs the values of other nodes along with the image paths. I have tried many times to filter them out by changing the template match values.

<xsl:template match="*">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>

<xsl:template match="xi:include[@href][@parse='xml' or not(@parse)]">
<xsl:apply-templates select="document(@href)" />
</xsl:template>

The expected result would be a list something like this:

/home/vish/test/images/image1.jpg

/home/vish/test/doc/other/images/image2.jpg

/home/vish/test2/other/images/image3.jpg

Thanks in advance!


Solution

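  • How about this? It visits elements only, so no text content leaks into the output, and it emits one element per imagedata: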

    <?xml version="1.0" encoding="utf-8"?>
    <xsl:stylesheet version="1.0"
          xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
          xmlns:xi="http://www.w3.org/2001/XInclude"
          exclude-result-prefixes="xsl xi">
    <xsl:output method="xml" indent="yes"/>
    <xsl:strip-space elements="*" />
    
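    <!-- Wrap all collected paths in a single root element. -->
    <xsl:template match="/">
     <image-paths>
      <xsl:apply-templates select="*" />
     </image-paths>
    </xsl:template>
    
    <!-- Descend through child elements only, so text nodes are never copied. -->
    <xsl:template match="*">
     <xsl:apply-templates select="*" />
    </xsl:template>
    
    <!-- Emit one element per image, keeping its fileref as written. -->
    <xsl:template match="imagedata">
     <imagedata fileref="{@fileref}" />
    </xsl:template>
    
    <!-- Follow XML includes into the referenced document. -->
    <xsl:template match="xi:include[@href][@parse='xml' or not(@parse)]">
     <xsl:apply-templates select="document(@href)" />
    </xsl:template>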
    
    </xsl:stylesheet>
    

    You should get output like this, with each fileref value copied as it is written in its source document:

    <image-paths>
     <imagedata fileref="path1/image1.jpg" />
     <imagedata fileref="path2/image2.jpg" />
     <imagedata fileref="path3/image3.jpg" />
    </image-paths>
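
Since the question asks for a structure a Java class can read, here is a minimal sketch of loading that output into a `List<String>` with the JDK's built-in DOM parser. The class name `ImagePathReader` and the method `readPaths` are made up for illustration, and the sketch assumes the output has exactly the `<image-paths>`/`<imagedata fileref="..."/>` shape shown above. It returns the `fileref` values as written, so resolving each one against the directory of the document it came from is still left to the caller.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
public class ImagePathReader {

    // Returns the fileref attribute of every <imagedata> element, in document order.
    public static List<String> readPaths(String imagePathsXml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(imagePathsXml)));

        NodeList nodes = doc.getElementsByTagName("imagedata");
        List<String> paths = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element imagedata = (Element) nodes.item(i);
            paths.add(imagedata.getAttribute("fileref"));
        }
        return paths;
    }

    public static void main(String[] args) throws Exception {
        // Sample input in the same shape as the stylesheet output above.
        String xml = "<image-paths>"
                + "<imagedata fileref=\"path1/image1.jpg\"/>"
                + "<imagedata fileref=\"path2/image2.jpg\"/>"
                + "</image-paths>";
        for (String path : readPaths(xml)) {
            System.out.println(path);
        }
    }
}
```

From there, each entry could be handed to something like `java.nio.file.Files.copy` once it has been resolved to a real location on disk.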