
How to get absolute document path of a file from collection


Please suggest how to get the absolute path of each document that is collected through the XSLT collection() function.

The script posted below gives the required absolute path, but it uses two collections, which may take unnecessary memory because the information for all articles is stored twice: one collection gathers the article content and the other gathers the document-uri() values.

XMLs:

D:/DocumentPath/Project-01/2016/ABC/Test.xml

<article>
  <title>First article</title>
  <tag1>The tag 1</tag1>
  <tag3>The tag 3</tag3>
</article>

D:/DocumentPath/Project-01/2016/DEF/Test.xml

<article>
  <title>Second article</title>
  <tag2>The tag 2</tag2>
  <tag3>The tag 3</tag3>
</article>

and other XMLs....

XSLT 2.0:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>

<xsl:variable name="varDocuments">
    <xsl:copy-of select="collection('file:///D:/DocumentPath/Project-01/2016/?select=*.xml;recurse=yes')
        [matches(document-uri(.), '2016/([A-z]+)/.*?.xml')]"/>
</xsl:variable>

<xsl:variable name="varDocuments1">
    <xsl:copy-of select="collection('file:///D:/DocumentPath/Project-01/2016/?select=*.xml;recurse=yes')
        [matches(document-uri(.), '2016/([A-z]+)/.*?.xml')]/document-uri(.)"/>
</xsl:variable>

<xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
</xsl:template>

<xsl:template match="/">
    <Table border="1">
        <TR><TH>Position</TH><TH>Title</TH><TH>Tag1</TH><TH>Tag2</TH><TH>Tag3</TH><TH>Tag4</TH><TH>Path</TH></TR>
        <xsl:for-each select="$varDocuments">
            <xsl:for-each select="article">
                <TR>
                    <xsl:variable name="varPos" select="position()"/>
                    <td><xsl:value-of select="position()"/></td>
                    <td><xsl:value-of select="title"/></td>
                    <td><xsl:value-of select="count(descendant::tag1)"/></td>
                    <td><xsl:value-of select="count(descendant::tag2)"/></td>
                    <td><xsl:value-of select="count(descendant::tag3)"/></td>
                    <td><xsl:value-of select="count(descendant::tag4)"/></td>
                    <td><xsl:value-of select="normalize-space(tokenize($varDocuments1, 'file:/')[position()=$varPos + 1])"/></td>
                </TR>
            </xsl:for-each>
        </xsl:for-each>
    </Table>
</xsl:template>

</xsl:stylesheet>

Required result:

<Table border="1">
   <TR>
      <TH>Position</TH>
      <TH>Title</TH>
      <TH>Tag1</TH>
      <TH>Tag2</TH>
      <TH>Tag3</TH>
      <TH>Tag4</TH>
      <TH>Path</TH>
   </TR>
   <TR>
      <td>1</td>
      <td>First article</td>
      <td>1</td>
      <td>0</td>
      <td>1</td>
      <td>0</td>
      <td>D:/DocumentPath/Project-01/2016/ABC/Test.xml</td>
   </TR>
   <TR>
      <td>2</td>
      <td>Second article</td>
      <td>0</td>
      <td>1</td>
      <td>1</td>
      <td>0</td>
      <td>D:/DocumentPath/Project-01/2016/DEF/Test.xml</td>
   </TR>
   <TR>
      <td>3</td>
      <td>Third article</td>
      <td>1</td>
      <td>0</td>
      <td>0</td>
      <td>2</td>
      <td>D:/DocumentPath/Project-01/2016/GHI/Test.xml</td>
   </TR>
</Table>


Solution

  • I would first suggest changing

    <xsl:variable name="varDocuments">
        <xsl:copy-of select="collection('file:///D:/DocumentPath/Project-01/2016/?select=*.xml;recurse=yes')
            [matches(document-uri(.), '2016/([A-z]+)/.*?.xml')]"/>
    </xsl:variable>
    

    to at least

    <xsl:variable name="varDocuments" select="collection('file:///D:/DocumentPath/Project-01/2016/?select=*.xml;recurse=yes')
            [matches(document-uri(.), '2016/([A-z]+)/.*?.xml')]"/>
    

    as there does not seem to be a need to pull in the documents with collection and then create an additional copy with copy-of.

    With that correction, when you process each document with <xsl:for-each select="$varDocuments">, you can simply read out document-uri(.) there, because you are now processing the documents pulled in by collection() rather than a copy assembled from them.
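
    A minimal sketch of how the stylesheet could then look, assuming the processor is Saxon (the ?select=*.xml;recurse=yes query syntax used in the question is Saxon's) and that it reports URIs in the form file:/D:/...; the variable names varPos and varPath are only illustrative. Two details change once $varDocuments is a sequence of documents rather than one temporary tree: position() has to be taken in the outer for-each, because inside the inner one each document contributes a single article and the position would always be 1, and document-uri(.) has to be called while the document node is the context item, because it returns an empty sequence for an element node.

    <xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>

    <!-- The documents themselves, not copies, so document-uri() is still available -->
    <xsl:variable name="varDocuments"
        select="collection('file:///D:/DocumentPath/Project-01/2016/?select=*.xml;recurse=yes')
            [matches(document-uri(.), '2016/([A-z]+)/.*?.xml')]"/>

    <xsl:template match="/">
        <Table border="1">
            <TR><TH>Position</TH><TH>Title</TH><TH>Tag1</TH><TH>Tag2</TH><TH>Tag3</TH><TH>Tag4</TH><TH>Path</TH></TR>
            <xsl:for-each select="$varDocuments">
                <!-- Context item here is a document node of the collection -->
                <xsl:variable name="varPos" select="position()"/>
                <!-- Assumption: the URI starts with file:/ ; strip that prefix to get D:/... -->
                <xsl:variable name="varPath" select="replace(string(document-uri(.)), '^file:/+', '')"/>
                <xsl:for-each select="article">
                    <TR>
                        <td><xsl:value-of select="$varPos"/></td>
                        <td><xsl:value-of select="title"/></td>
                        <td><xsl:value-of select="count(descendant::tag1)"/></td>
                        <td><xsl:value-of select="count(descendant::tag2)"/></td>
                        <td><xsl:value-of select="count(descendant::tag3)"/></td>
                        <td><xsl:value-of select="count(descendant::tag4)"/></td>
                        <td><xsl:value-of select="$varPath"/></td>
                    </TR>
                </xsl:for-each>
            </xsl:for-each>
        </Table>
    </xsl:template>

    </xsl:stylesheet>

    This drops the second collection (varDocuments1) and the tokenize() lookup entirely. The replace() call is there only to match the path format in the required result; how the URI is spelled is up to the processor, so the pattern may need adjusting.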