Search code examples
xquerymarklogicmarklogic-9marklogic-optic-api

How do you join views with similar values with the Optic API?


I want to use the optic API to join two views. The join looks something like this:

xquery version "1.0-ml";
import module namespace op="http://marklogic.com/optic" at "/MarkLogic/optic.xqy";

let $qs := op:from-view("Samplestack", "QnA")
let $users := op:from-view("Samplestack", "Contributors")
return $qs
=> op:join-inner($users, op:on("Asker", "ContributorUserName"))
=> op:select(("QnATitle", "QVoteCount", "Asker", "UserReputation"))

The issue is that the two columns "ContributorUserName" and "Asker" are not completly identical. "Asker" has a prefix of "username.".

So i need to concat "username." and the value of "ContributorUserName".

According to the Docuementation i can call somehow the concat function within in the plan and therfore do not have to use the function op:call().

But i can not get it to work.


Solution

  • The query should be able to modify the view with a select() operation with an expression before the join, as in the following sketch (untested):

    => op:join-inner(
        $users => op:select((op:as("PrefixedUserName", ofn:concat((
            "username.",op:col("ContributorUserName")
            ))), ...otherNeededColumns...)), 
        op:on("Asker", "PrefixedUserName")
        )
    

    As you may well know already, the module must import the ofn module to get access to the concat() expression function:

    import module namespace ofn="http://marklogic.com/optic/expression/fn"
        at "/MarkLogic/optic/optic-fn.xqy";
    

    All Namspaces for the XQuery Libraries can be found here

    The only caveat is that (at least currently) joining on an expression forces the join to the enode instead of executing a distributed join on the dnodes.

    Thus, if the result set is large, it might improve the performance to perform the concatenation in the TDE, creating two columns if necessary and using whichever column is a appropriate for a query. Or, even better, trimming the prefix on the other Asker values in the TDE (again, creating two columns if necessary).

    Hoping that helps,