Synchronize graph with XML-tree and apply axes to them


I transformed a graph with cycles and multiple parents into XML so that I can run XQuery on it. The graph is on the left and the XML tree is on the right. I built the tree by writing down all child nodes of the first node (node 1) and repeating that for each returned node until no more children exist or a node has already been visited (like node 2). Furthermore, I added the constraint that if one node with a given number is selected, all nodes with the same number must be selected too. (For example, if node 2 (child of 1) is selected, then node 2 (child of 6) in the XML tree must also be selected.) The operations I can use on the graph are getParents, getChildren, and readValue(node). In the graph, all information is stored in the node itself; in the XML tree, all information about a node is stored as attributes.

[Image: the graph (left) and the XML tree (right)]
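
The unfolding described above can be sketched in Python as follows. This is only an illustration: the figure is not reproduced here, so the edge list `CHILDREN` is a hypothetical graph chosen to be consistent with the description (node 2 reachable from both 1 and 6, node 8 below 4 and 5), and the names `unfold` and `build` are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical edge list (assumption: the real graph is only shown in the figure).
CHILDREN = {
    1: [2],
    2: [4, 5, 3],
    3: [6],
    4: [8],
    5: [8],
    6: [2],  # closes the cycle 2 -> 3 -> 6 -> 2
    8: [],
}


def unfold(root_id, children):
    """Unfold a graph into an XML tree of <node value="..."/> elements.

    Every child is written down, but a node that was already expanded
    once is emitted as a leaf, so cycles terminate.
    """
    expanded = set()

    def build(node_id, parent_elem):
        attrs = {"value": str(node_id)}
        if parent_elem is None:
            elem = ET.Element("node", attrs)
        else:
            elem = ET.SubElement(parent_elem, "node", attrs)
        if node_id in expanded:
            return elem  # already visited: write it, do not expand again
        expanded.add(node_id)
        for child_id in children.get(node_id, []):
            build(child_id, elem)
        return elem

    return build(root_id, None)


tree = unfold(1, CHILDREN)
print(ET.tostring(tree, encoding="unicode"))
```

With this sample graph, nodes 2 and 8 each appear twice in the resulting tree, which is exactly the situation the "same number" constraint has to deal with.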

My Question: I want to synchronize both structures so that I can apply an axis such as ancestor (or descendant) to the graph and to the XML tree and get the same result. (I can parse the graph with Python and the XML tree with XQuery.)

My Problem: If I select node 8 in the graph and apply the ancestor function, it returns 4, 5, 2, 1, 6, 3 (6 and 3 because of the cycle). The ancestor axis on the XML tree (where both 8s have to be selected) returns only 4, 5, 2, 1. The second 2 (child of 6) is also selected because of the constraint, but nodes 6 and 3 are not.
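
For reference, the graph-side ancestor function could look roughly like the sketch below: a breadth-first walk over parent links. The `PARENTS` table is a hypothetical stand-in for the real graph (which is only shown in the figure), and `get_parents` is a placeholder for the getParents operation mentioned above.

```python
from collections import deque

# Hypothetical parent lists (assumption: stands in for the graph in the figure).
PARENTS = {
    1: [],
    2: [1, 6],
    3: [2],
    4: [2],
    5: [2],
    6: [3],
    8: [4, 5],
}


def get_parents(node_id):
    """Placeholder for the graph's getParents operation."""
    return PARENTS.get(node_id, [])


def graph_ancestors(start):
    """Return all ancestors of `start` in breadth-first order.

    The `seen` set makes the walk terminate on cycles and keeps
    nodes with several children from being reported twice.
    """
    seen = set()
    order = []
    queue = deque(get_parents(start))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        queue.extend(get_parents(node))
    return order


print(graph_ancestors(8))  # [4, 5, 2, 1, 6, 3]
```

On this sample graph the walk reaches 6 and 3 through the second parent of node 2, which is the part the plain XML ancestor axis misses.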

My Solution: Change the ancestor axis so that it returns all parents of the selected nodes, then applies the constraint, then selects all parents again, and so on. But this solution seems very complicated and inefficient. Is there a better way?

Thanks for your help


Solution

  • I don't think this is easy to solve for that particular format with XSLT/XQuery/XPath. The document order imposed by most step expressions and by except or intersect, and the arbitrary order produced by XQuery grouping, make it hard to return the nodes you want in the order they are traversed. The easiest approach I could come up with is:

    declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";
    
    (: serialize the result as plain text, items separated by ', ' :)
    declare option output:method 'text';
    declare option output:item-separator ', ';
    
    (: keep the document root so functions can search the whole tree :)
    declare variable $main-root := /;
    
    (: remove duplicate nodes (by identity), keeping first-occurrence order :)
    declare function local:eliminate-duplicates($nodes as node()*) as node()*
    {
        for $node at $p in $nodes
        group by $id := generate-id($node)
        order by head($p)
        return head($node)
    };
    
    (: repeat "take parents, then apply the same-value constraint" until nothing new is found :)
    declare function local:get-parents($nodes as element(node)*, $collected as element(node)*) as element(node)*
    {
      let $new-parents := 
        for $p in local:eliminate-duplicates($nodes ! ..)
        (: every node with the parent's value that has not been collected yet :)
        return $main-root//node[@value = $p/@value][not(. intersect $collected)]
      return
        if ($new-parents)
        then local:get-parents($new-parents, ($collected, $new-parents))
        else $collected
    };
    
    
    local:get-parents(//node[@value = 8], ()) ! @value ! string()
    

    https://xqueryfiddle.liberty-development.net/gWmuPs8 gives 4, 5, 2, 2, 1, 6, 3 (2 appears twice because both elements with value 2 are selected).

    How efficiently this runs depends partly on whether an index backs the node[@value = $p/@value] comparison. In XSLT you can ensure that with a key (https://xsltfiddle.liberty-development.net/aiyneS); in database-oriented XQuery processors, an attribute-based index will probably serve the same purpose.
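
To illustrate why an index on the value matters, here is a rough Python sketch of the same fixed-point idea using the standard library's `xml.etree.ElementTree`. It is not a translation of the XQuery above: it works on values rather than elements, so each value is reported once. The sample document, the `by_value` dictionary (playing the role of a key or attribute index), and the function name `synced_ancestor_values` are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample document (assumption: shaped like the tree in the question).
XML = """<node value="1"><node value="2">
  <node value="4"><node value="8"/></node>
  <node value="5"><node value="8"/></node>
  <node value="3"><node value="6"><node value="2"/></node></node>
</node></node>"""

root = ET.fromstring(XML)

# ElementTree elements have no parent pointer, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter("node") for child in parent}

# Index: value -> all elements carrying that value (one pass over the tree).
by_value = defaultdict(list)
for elem in root.iter("node"):
    by_value[elem.get("value")].append(elem)


def synced_ancestor_values(value):
    """Ancestor values of `value`, treating equal-valued elements as one node."""
    collected = []  # values in the order they are discovered
    seen = set()
    frontier = [value]
    while frontier:
        next_frontier = []
        for v in frontier:
            # The index lookup applies the "same number" constraint cheaply.
            for elem in by_value[v]:
                parent = parent_of.get(elem)
                if parent is None:
                    continue  # the root element has no parent
                parent_value = parent.get("value")
                if parent_value not in seen:
                    seen.add(parent_value)
                    collected.append(parent_value)
                    next_frontier.append(parent_value)
        frontier = next_frontier
    return collected


print(", ".join(synced_ancestor_values("8")))  # 4, 5, 2, 1, 6, 3
```

Because each value enters the frontier at most once and the lookup is a dictionary access, every element is visited a bounded number of times; without the index, each round would rescan the whole tree.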