Tag-based similarity computation is one way to measure how similar XML documents are (usually several documents, but in this case just two), and it has several applications. How can such a method be implemented using XSLT?
My idea is this: extract the tags from both documents into two lists, then check for exact or partial matches between the two lists.
In this regard, does XSLT provide any function or operation for comparing strings (tag names)? Any ideas on the concept and implementation are welcome.
Simple Example:
For these XML documents (only portions of them, of course),
<book id="bk101">
  <author>Gambardella, Matthew</author>
  <title>XML Developer's Guide</title>
  <genre>Computer</genre>
  <price>44.95</price>
  <publish_date>2000-10-01</publish_date>
  <description>An in-depth look at creating applications
  with XML.</description>
</book>
and this one,
<books>
  <authorname>Ralls, Kim</authorname>
  <booktitle>Midnight Rain</booktitle>
  <genre>Fantasy</genre>
  <cost>5.95</cost>
  <date>2000-12-16</date>
  <abstract>A former architect battles corporate zombies,
  an evil sorceress, and her own childhood to become queen
  of the world.</abstract>
</books>
Both documents have six child elements (tags). Among them, genre appears in both, title is similar to booktitle, author to authorname, and publish_date to date. So these two documents are similar (1 exact match, 3 partial matches).
Assuming XSLT 2.0, the following stylesheet takes the first XML document as its input and the URL of the second document as a parameter. For each element name in the first document, it outputs the names from the second document that contain that name or are contained in it:
<xsl:stylesheet
    version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- URL of the second document; override it when running the transformation -->
  <xsl:param name="doc2-url" as="xs:string" select="'test2015012102.xml'"/>

  <xsl:variable name="doc2" as="document-node()" select="doc($doc2-url)"/>

  <!-- Distinct element names of the second document -->
  <xsl:variable name="doc2-names" as="xs:string*" select="distinct-values($doc2//*/local-name())"/>

  <xsl:template match="/">
    <!-- One output line per distinct element name of the input document -->
    <xsl:value-of select="for $name in distinct-values(//*/local-name())
                          return concat($name, ': ', string-join($doc2-names[contains($name, .) or contains(., $name)], ', '))"
                  separator="&#10;"/>
  </xsl:template>

</xsl:stylesheet>
So for your sample the output is
book: books, booktitle
author: authorname
title: booktitle
genre: genre
price:
publish_date: date
description:
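If you want the counts from your question (exact versus partial matches) rather than a list, the same idea can be extended. The following is only a minimal sketch under some assumptions: the default URL 'doc2.xml' is a placeholder, the root elements are skipped so that only the six child elements of each sample are compared, and "partial" simply means that one name is a substring of the other without being equal to it.

```xml
<xsl:stylesheet
    version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Placeholder default (an assumption); pass the real URL as a parameter -->
  <xsl:param name="doc2-url" as="xs:string" select="'doc2.xml'"/>

  <!-- Distinct element names below the root element of the second document -->
  <xsl:variable name="names2" as="xs:string*"
                select="distinct-values(doc($doc2-url)/*//*/local-name())"/>

  <xsl:template match="/">
    <!-- Distinct element names below the root element of the input document -->
    <xsl:variable name="names1" as="xs:string*"
                  select="distinct-values(/*//*/local-name())"/>

    <!-- Exact: the same name occurs in both documents -->
    <xsl:variable name="exact" as="xs:string*"
                  select="$names1[. = $names2]"/>

    <!-- Partial: no exact match, but one name is a substring of the other -->
    <xsl:variable name="partial" as="xs:string*"
                  select="$names1[not(. = $names2)]
                                 [some $n in $names2
                                  satisfies (contains($n, .) or contains(., $n))]"/>

    <xsl:value-of select="concat('exact: ', count($exact),
                                 ', partial: ', count($partial),
                                 ', total: ', count($names1))"/>
  </xsl:template>

</xsl:stylesheet>
```

For the two samples in the question this should give exact: 1, partial: 3, total: 6. Keep in mind that substring containment is a crude test (date would also match update, for instance), so you may need a stricter comparison for real data.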