
XSLT - can't copy elements only once in a complex graph representation


I have XML data (GraphML) that I need to transform for my application. The XML represents a graph that has nodes labelled "User" and "Item", and edges labelled "HAS_HOBBY" and "FRIEND_OF".

Given a specific user, I want the transform to output all of his friends who share at least one hobby with him, together with those hobbies (represented by items). Friendships are represented by "FRIEND_OF" edge elements, and hobbies by "HAS_HOBBY" edges.

I have an XSLT stylesheet (I'm fairly new to this) that can find the required items and the friends. However, with my logic I can't manage to copy a friend just once: the friend is copied once for every hobby he shares with the original user. I go over each of the friend's hobbies for each of the user's hobbies, and when there's a match I output the item (hobby), which is fine, and the friend. But the friend is output every time a match is found, so he appears multiple times, which is undesired.

I tried searching for ways to avoid this, but I think my whole approach to this solution is flawed. I have no other ideas, though.

Here's my XSL:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ns="http://graphml.graphdrawing.org/xmlns"
    xmlns="http://graphml.graphdrawing.org/xmlns"
    exclude-result-prefixes="ns #default">
  <xsl:strip-space elements="*"/>
  <xsl:output indent="yes"/>



  <!--Identity template: default copy all content into the output -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Don't copy elements named 'node' or 'edge' by default -->
  <xsl:template match="ns:node" />
  <xsl:template match="ns:edge" />



  <xsl:template match="ns:node[ns:data[@key='username' and . = 'c']]">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>

    <xsl:variable name="USERID" select="@id"/>

    <xsl:for-each select="//ns:edge"> 

      <xsl:if test="@source=$USERID">

        <xsl:variable name="TARGET" select="@target"/>
        <xsl:for-each select="//ns:node[@id=$TARGET]">
          <!-- finds USERNAME's hobbies -->

          <xsl:for-each select="//ns:edge[@source=$USERID and @label='HAS_HOBBY']">
            <xsl:variable name="HOBBYTARGET" select="@target"/>
            <xsl:for-each select="//ns:edge[@source=$TARGET and @label='HAS_HOBBY']">
              <xsl:if test="@target=$HOBBYTARGET">
                <!-- Shared hobby with friend -->
                <xsl:for-each select="//ns:node[@id=$HOBBYTARGET]">
                  <xsl:copy>
                    <xsl:apply-templates select="node()|@*"/>
                  </xsl:copy>
                </xsl:for-each>


              </xsl:if>
            </xsl:for-each>  
          </xsl:for-each>
        </xsl:for-each>
      </xsl:if>

      <xsl:if test="@target=$USERID">

        <xsl:variable name="SOURCE" select="@source"/>
        <xsl:for-each select="//ns:node[@id=$SOURCE]">
          <!-- finds USERNAME's hobbies -->

          <xsl:for-each select="//ns:edge[@source=$USERID and @label='HAS_HOBBY']">
            <xsl:variable name="HOBBYTARGET" select="@target"/>
            <xsl:for-each select="//ns:edge[@source=$SOURCE and @label='HAS_HOBBY']">
              <xsl:if test="@target=$HOBBYTARGET">
                <!-- Shared hobby with friend -->
                <xsl:for-each select="//ns:node[@id=$HOBBYTARGET]">
                  <xsl:copy>
                    <xsl:apply-templates select="node()|@*"/>
                  </xsl:copy>
                </xsl:for-each>


              </xsl:if>
            </xsl:for-each>  
          </xsl:for-each>

        </xsl:for-each>
      </xsl:if>
    </xsl:for-each>

  </xsl:template>

</xsl:stylesheet>

At the moment the friend's copy is missing, but it would go right after the "Shared hobby with friend" comment.

I realised I can't use a 'flag'-type variable (XSLT variables cannot be reassigned), and there are no arrays or similar data structures, so I'm really out of ideas.

Please help me get a user's friends with whom he shares at least one hobby (item), and the hobbies themselves.

EDIT: Sample input. I added a graph visualisation as well so it's easy to see:

[Image: graph visualisation of the sample input]

<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
<graph id="G" edgedefault="directed">

<node id="n2" labels=":Item"><data key="labels">:Item</data><data key="itemId">Q1</data></node>
<node id="n32" labels=":Item"><data key="labels">:Item</data><data key="itemId">Q8</data></node>
<node id="n51" labels=":Item"><data key="labels">:Item</data><data key="itemId">Q23</data></node>
<node id="n897" labels=":Item"><data key="labels">:Item</data><data key="itemId">Q55</data></node>

<node id="n406727" labels=":User"><data key="labels">:User</data><data key="hobbies">[Ljava.lang.String;@78ba00a3</data><data key="firstName">a</data><data key="imgPath">/uploads/a.png</data><data key="surName">a</data><data key="username">a</data><data key="gender">Male</data><data key="relaStatus">Single</data></node>
<node id="n406729" labels=":User"><data key="labels">:User</data><data key="hobbies"></data><data key="firstName">b</data><data key="imgPath">/uploads/b.png</data><data key="surName">b</data><data key="username">b</data><data key="gender">Male</data><data key="relaStatus">Single</data></node>
<node id="n406731" labels=":User"><data key="labels">:User</data><data key="hobbies"></data><data key="blocked">[Ljava.lang.String;@7b800b40</data><data key="firstName">c</data><data key="imgPath">/uploads/c.png</data><data key="surName">c</data><data key="username">c</data><data key="gender">Male</data><data key="relaStatus">Single</data></node>
<node id="n406734" labels=":User"><data key="labels">:User</data><data key="hobbies"></data><data key="firstName">d</data><data key="imgPath">/uploads/d.png</data><data key="surName">d</data><data key="username">d</data><data key="gender">Male</data><data key="relaStatus">Single</data></node>

<edge id="e1223400" source="n406727" target="n406729" label="FRIEND_OF"><data key="label">FRIEND_OF</data></edge>
<edge id="e1223403" source="n406727" target="n406731" label="FRIEND_OF"><data key="label">FRIEND_OF</data></edge>
<edge id="e1223405" source="n406734" target="n406731" label="FRIEND_OF"><data key="label">FRIEND_OF</data></edge>
<edge id="e1223405" source="n406727" target="n406734" label="FRIEND_OF"><data key="label">FRIEND_OF</data></edge>

<edge id="e1223374" source="n406727" target="n2" label="HAS_HOBBY"><data key="label">HAS_HOBBY</data></edge>
<edge id="e1223385" source="n406727" target="n51" label="HAS_HOBBY"><data key="label">HAS_HOBBY</data></edge>
<edge id="e1223383" source="n406729" target="n2" label="HAS_HOBBY"><data key="label">HAS_HOBBY</data></edge>
<edge id="e1223384" source="n406731" target="n2" label="HAS_HOBBY"><data key="label">HAS_HOBBY</data></edge>
<edge id="e1223375" source="n406731" target="n51" label="HAS_HOBBY"><data key="label">HAS_HOBBY</data></edge>
<edge id="e1223371" source="n406734" target="n897" label="HAS_HOBBY"><data key="label">HAS_HOBBY</data></edge>

</graph>
</graphml>

And here's the sample output. You can see that only c and b are left in the result, since they have hobbies (items with Q ids) in common with a. So d, the edge a-d, and the items Q55 and Q8 are gone.

[Image: graph visualisation of the sample output]

<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
<graph id="G" edgedefault="directed">

<node id="n2" labels=":Item"><data key="labels">:Item</data><data key="itemId">Q1</data></node>
<node id="n51" labels=":Item"><data key="labels">:Item</data><data key="itemId">Q23</data></node>

<node id="n406727" labels=":User"><data key="labels">:User</data><data key="hobbies">[Ljava.lang.String;@78ba00a3</data><data key="firstName">a</data><data key="imgPath">/uploads/a.png</data><data key="surName">a</data><data key="username">a</data><data key="gender">Male</data><data key="relaStatus">Single</data></node>
<node id="n406729" labels=":User"><data key="labels">:User</data><data key="hobbies"></data><data key="firstName">b</data><data key="imgPath">/uploads/b.png</data><data key="surName">b</data><data key="username">b</data><data key="gender">Male</data><data key="relaStatus">Single</data></node>
<node id="n406731" labels=":User"><data key="labels">:User</data><data key="hobbies"></data><data key="blocked">[Ljava.lang.String;@7b800b40</data><data key="firstName">c</data><data key="imgPath">/uploads/c.png</data><data key="surName">c</data><data key="username">c</data><data key="gender">Male</data><data key="relaStatus">Single</data></node>

<edge id="e1223400" source="n406727" target="n406729" label="FRIEND_OF"><data key="label">FRIEND_OF</data></edge>
<edge id="e1223403" source="n406727" target="n406731" label="FRIEND_OF"><data key="label">FRIEND_OF</data></edge>
<edge id="e1223405" source="n406734" target="n406731" label="FRIEND_OF"><data key="label">FRIEND_OF</data></edge>

<edge id="e1223374" source="n406727" target="n2" label="HAS_HOBBY"><data key="label">HAS_HOBBY</data></edge>
<edge id="e1223385" source="n406727" target="n51" label="HAS_HOBBY"><data key="label">HAS_HOBBY</data></edge>
<edge id="e1223383" source="n406729" target="n2" label="HAS_HOBBY"><data key="label">HAS_HOBBY</data></edge>
<edge id="e1223384" source="n406731" target="n2" label="HAS_HOBBY"><data key="label">HAS_HOBBY</data></edge>
<edge id="e1223375" source="n406731" target="n51" label="HAS_HOBBY"><data key="label">HAS_HOBBY</data></edge>

</graph>
</graphml>

Thank you for your time.

Edit #2: Added data for Label nodes and hasLabel edges:

<node id="n3" labels=":Label"><data key="labels">:Label</data><data key="en-gb">Universe</data></node>
<edge id="e0" source="n2" target="n3" label="hasLabel"><data key="label">hasLabel</data></edge>

This edge connects node n2, which has the itemId Q1, to node n3, which holds its label, "Universe".


Solution

  • Here is an example using XSLT 2.0 (as supported by Saxon 9, XmlPrime, Altova, Exselt). It uses keys to look up the nodes and edges, and then set operations like intersect to output only the shared nodes:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        exclude-result-prefixes="xs"
        xpath-default-namespace="http://graphml.graphdrawing.org/xmlns"
        version="2.0">
    
    <xsl:param name="user-name" as="xs:string" select="'c'"/>
    
    <xsl:output indent="yes"/>
    
    <xsl:key name="user-name" match="node[@labels = ':User']" use="data[@key = 'username']"/>
    
    <xsl:key name="node-id" match="node" use="@id"/>
    
    <xsl:key name="source-friends" match="edge[@label = 'FRIEND_OF']" use="@source"/>
    <xsl:key name="target-friends" match="edge[@label = 'FRIEND_OF']" use="@target"/>
    <xsl:key name="source-hobbies" match="edge[@label = 'HAS_HOBBY']" use="@source"/>
    
    <xsl:variable name="start-node" select="key('user-name', $user-name)"/>
    
    <xsl:variable name="start-friends"
                   select="key('node-id', key('source-friends', $start-node/@id)/@target) |
                           key('node-id', key('target-friends', $start-node/@id)/@source)"/>
    
    <xsl:variable name="start-hobbies" select="key('node-id', key('source-hobbies', $start-node/@id)/@target)"/>
    
    <xsl:variable name="friends-with-shared-hobby" select="$start-friends[key('node-id', key('source-hobbies', @id)/@target) intersect $start-hobbies]"/>
    
    <xsl:variable name="shared-hobbies" select="$start-hobbies intersect key('node-id', key('source-hobbies', $friends-with-shared-hobby/@id)/@target)"/>
    
    <xsl:template match="/*">
        <xsl:copy>
            <xsl:copy-of select="$start-node | $friends-with-shared-hobby | $shared-hobbies"/>
        </xsl:copy>
    </xsl:template>
    
    </xsl:stylesheet>
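
Since the stylesheet in the question is XSLT 1.0, the same key-based idea can also be sketched for a 1.0 processor, where intersect is not available. The following is a minimal sketch and not part of the original answer: it replaces intersect with existential "=" comparisons on node ids. As an assumption about the desired output, it also keeps the graph wrapper element and copies every edge whose source and target both survive. The g prefix and the keep variable are names chosen here for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:g="http://graphml.graphdrawing.org/xmlns">

  <!-- User to start from; 'c' mirrors the stylesheets above -->
  <xsl:param name="user-name" select="'c'"/>

  <xsl:output indent="yes"/>

  <xsl:key name="user-name" match="g:node[@labels = ':User']" use="g:data[@key = 'username']"/>
  <xsl:key name="node-id" match="g:node" use="@id"/>
  <xsl:key name="source-friends" match="g:edge[@label = 'FRIEND_OF']" use="@source"/>
  <xsl:key name="target-friends" match="g:edge[@label = 'FRIEND_OF']" use="@target"/>
  <xsl:key name="source-hobbies" match="g:edge[@label = 'HAS_HOBBY']" use="@source"/>

  <xsl:variable name="start-node" select="key('user-name', $user-name)"/>

  <!-- Friends in either edge direction; a union never holds a node twice -->
  <xsl:variable name="start-friends"
      select="key('node-id', key('source-friends', $start-node/@id)/@target) |
              key('node-id', key('target-friends', $start-node/@id)/@source)"/>

  <xsl:variable name="start-hobbies"
      select="key('node-id', key('source-hobbies', $start-node/@id)/@target)"/>

  <!-- "=" on two node-sets is true if any pair of values matches,
       so each friend is tested once, however many hobbies are shared -->
  <xsl:variable name="friends-with-shared-hobby"
      select="$start-friends[key('source-hobbies', @id)/@target = $start-hobbies/@id]"/>

  <xsl:variable name="shared-hobbies"
      select="$start-hobbies[@id = key('source-hobbies', $friends-with-shared-hobby/@id)/@target]"/>

  <!-- Illustrative name: all nodes that survive the filter -->
  <xsl:variable name="keep"
      select="$start-node | $friends-with-shared-hobby | $shared-hobbies"/>

  <xsl:template match="/g:graphml">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:for-each select="g:graph">
        <xsl:copy>
          <xsl:copy-of select="@*"/>
          <xsl:copy-of select="$keep"/>
          <!-- Assumption: keep an edge only if both of its endpoints are kept -->
          <xsl:copy-of select="g:edge[@source = $keep/@id and @target = $keep/@id]"/>
        </xsl:copy>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Because every selection is a node-set, a friend can appear in it only once, so no flag variable or array is needed. With the sample input and user-name 'c', this should select the users a and c, the items n2 and n51, and the edges between those four nodes.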