
Unable to find xdmp:plan result documentation


I am having a hard time finding details about what the output of xdmp:plan means.

Having a simple query like this:

xdmp:plan(cts:search(doc(), cts:element-value-query(xs:QName("description"), "some text")))

results in quite a long execution plan:

<qry:query-plan xmlns:qry="http://marklogic.com/cts/query">
<qry:expr-trace>...</qry:expr-trace>
...
<qry:partial-plan>
  <qry:term-query weight="1">
      <qry:key>16037778974159125508</qry:key>
      <qry:annotation>element(description,value("some","text"))</qry:annotation>
  </qry:term-query>
</qry:partial-plan>
...
<qry:ordering></qry:ordering>
<qry:final-plan>
  <qry:and-query>
    <qry:term-query weight="1">
      <qry:key>16037778974159125508</qry:key>
      <qry:annotation>element(description,value("some","text"))</qry:annotation>
    </qry:term-query>
  </qry:and-query>
</qry:final-plan>
<qry:info-trace>Selected 0 fragments to filter</qry:info-trace>
<qry:result estimate="0"></qry:result>
</qry:query-plan>

The only part of the documentation that mentions xdmp:plan is its own function reference. Other than that, I cannot find anything. I'd like some details about what, for example, qry:key or qry:annotation really mean.

Is there any documentation I am missing that describes the possible output of xdmp:plan? As this is a really valuable tool for understanding query performance, I expected it to be rather well documented.


Edit: This MarkLogic blog post I found gives some examples of how a query plan can be interpreted.

Still, I feel like a blog post should not be the only reasonable documentation for this tool.

Some questions still on my mind:

  • What's the difference between a partial-plan and a final-plan? Is a final-plan a merge of all partial-plans? For what and when is a partial-plan used? Partial-plans seem to contribute constraints. Are these constraints used at the index resolution stage to find candidate fragment IDs? What role does a final-plan play there? Is a final-plan used to filter out false positives after index resolution?

Sometimes I can find this in the query plan:

<qry:elem-word-trace text="computer" elem-name="title" elem-uri="">
   <qry:key>10975994818398622042</qry:key>
</qry:elem-word-trace>
  • What does a qry:elem-word-trace mean?
  • What about <qry:ordering></qry:ordering>? Added a simple description about ordering to my answer.
  • A simple XPath query like /doc[id = 1] outputs the following info-trace and partial-plan twice.

Is there a reason for that? Why does step 2 predicate 1 contribute the same partial-plan twice?

<qry:info-trace>Step 2 predicate 1 contributed 1 constraint: id = 1</qry:info-trace>
<qry:partial-plan xmlns:qry="...">...</qry:partial-plan>
<qry:info-trace>Step 2 predicate 1 contributed 1 constraint: id = 1</qry:info-trace>
<qry:partial-plan xmlns:qry="...">...</qry:partial-plan>

Solution

  • After some more searching and reading, I decided to summarize my findings.

    Note: If you are not using fragmentation, every use of "fragment" can be put on par with "document".

    Partial-plan vs final-plan

    A partial-plan just shows the incremental pieces of the plan as they come in, and seems to be mostly for informational use.

    The final-plan, on the other hand, is the request as it is sent to the index, and is thus the interesting part most of the time.
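
    Since the plan is ordinary XML, the interesting parts can be pulled out with a path expression. This is a minimal sketch, assuming the qry namespace URI shown in the question's output and a title element in your own data:

    ```xquery
    xquery version "1.0-ml";

    (: Namespace URI as it appears in the xdmp:plan output above. :)
    declare namespace qry = "http://marklogic.com/cts/query";

    let $plan := xdmp:plan(
      cts:search(doc(), cts:element-word-query(xs:QName("title"), "computer"))
    )
    return (
      (: the query as it is sent to the index :)
      $plan//qry:final-plan,
      (: human-readable trace messages :)
      $plan//qry:info-trace,
      (: index-based estimate of matching fragments :)
      $plan//qry:result
    )
    ```

    This drops the partial-plan and expr-trace elements, which keeps the output short when the plan gets long.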

    Selected x fragments

    The documentation of query-trace gives some insight of what the info-trace messages mean:

    A filtered query produces an info-trace message stating how many candidate fragment references were returned from the index resolution stage of query processing:

    xdmp:plan(cts:search(doc(), cts:element-word-query(xs:QName("title"), "computer")))
    => ...
    <qry:info-trace>Selected 2 fragments to filter</qry:info-trace>
    

    An unfiltered query logs the same message but without the "to filter", indicating that the second step, filtering, is not executed:

    xdmp:plan(cts:search(doc(), cts:element-word-query(xs:QName("title"), "computer"), ("unfiltered")))
    => ...
    <qry:info-trace>Selected 2 fragments</qry:info-trace>
    

    qry:result

    <qry:result estimate="2"></qry:result>
    

    The estimate in qry:result shows how many fragments match the query using the index information alone. It is an estimate taken before the filtering step, so it might include false positives. I think the estimate value and the count in the info-trace message described above are always the same.
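
    One way to check whether the index-only estimate contains false positives is to compare it with the filtered count. This is only a sketch under the assumption that the title element query from the examples above fits your data; I have not verified it against every index configuration:

    ```xquery
    xquery version "1.0-ml";

    let $query := cts:element-word-query(xs:QName("title"), "computer")
    return (
      (: index-only count, may include false positives :)
      xdmp:estimate(cts:search(doc(), $query)),
      (: filtered count, each candidate fragment is checked :)
      fn:count(cts:search(doc(), $query))
    )
    ```

    If the two numbers differ, index resolution alone is not precise for this query with the current index settings, and an unfiltered search would return false positives.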


    Different annotation examples

    With only word searches enabled (fast element word searches disabled), an element-word-query returns this final-plan:

    xdmp:plan(cts:search(doc(), cts:element-word-query(xs:QName("title"), "computer")))
    => ...
    <qry:final-plan>
       <qry:and-query>
          <qry:term-query weight="1">
             <qry:key>13967911917401594192</qry:key>
             <qry:annotation>word("computer")</qry:annotation>
          </qry:term-query>
          <qry:term-query weight="0">
             <qry:key>745773915438417736</qry:key>
             <qry:annotation>element(title)</qry:annotation>
          </qry:term-query>
       </qry:and-query>
    </qry:final-plan>
    

    Having two separate term-queries, one for word("computer") and one for element(title), means the index will also return documents that contain the word "computer" outside of the title element. So an unfiltered search could return false positives.

    With both word searches and fast element word searches enabled, an element-word-query returns this final-plan:

    <qry:final-plan>
       <qry:and-query>
          <qry:term-query weight="1">
             <qry:key>10975994818398622042</qry:key>
             <qry:annotation>element(title,word("computer"))</qry:annotation>
          </qry:term-query>
       </qry:and-query>
    </qry:final-plan>
    

    Here the annotation indicates a combined lookup for the word "computer" inside the title element. This query could run unfiltered and still return no false positives in my case.

    More detailed information in this blog post.


    qry:ordering

    The <qry:ordering> tag indicates how the resulting candidate fragment references are ordered. This can be controlled by passing one of the cts:order constructors to the cts:search function. Example:

    xdmp:plan(
      cts:search(
        doc(), 
        cts:element-word-query(xs:QName("title"), "computer"), 
        (cts:unordered())
    ))
    =>....
    <qry:ordering>
        <qry:unordered></qry:unordered>
    </qry:ordering>
    

    How to see if an index is used

    I always wondered how to see whether an index is used (being used to query execution plans that show something like a full index scan). It turns out you can tell quite easily:

    Look for <qry:info-trace> messages that mention searchable or unsearchable. A step reported as searchable is good: that part of your query can be resolved using an index. A step reported as unsearchable might be a bad sign.

    The log message for xdmp:plan(//image/id[. = "1"]/..) could look like this:

    <qry:info-trace>Analyzing path: fn:collection()/descendant::image/id[. = "1"]/..</qry:info-trace>
    <qry:info-trace>Step 1 is searchable: fn:collection()</qry:info-trace>
    <qry:info-trace>Step 2 is searchable: descendant::image</qry:info-trace>
    <qry:info-trace>Step 3 is searchable: id[. = "1"]</qry:info-trace>
    <qry:info-trace>Step 4 axis is unsearchable: parent</qry:info-trace>
    <qry:info-trace>Step 4 is unsearchable: ..</qry:info-trace>
    

    This means all parts except step 4, the /.., can be resolved by the index. This might not be a bad sign, depending on your query. In this case, though, the query could be modified:

    This slightly modified query, xdmp:plan(//image[id = "1"]), can use the index for all steps:

    <qry:info-trace>Analyzing path: fn:collection()/descendant::image[id = "1"]</qry:info-trace>
    <qry:info-trace>Step 1 is searchable: fn:collection()</qry:info-trace>
    <qry:info-trace>Step 2 is searchable: descendant::image[id = "1"]</qry:info-trace>
    <qry:info-trace>Path is fully searchable.</qry:info-trace>
    

    Further details can be found here.
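
    To spot problem steps quickly, the trace messages can be filtered for the word unsearchable. A minimal sketch, again assuming the qry namespace URI from the plan output and the image/id path used above:

    ```xquery
    xquery version "1.0-ml";

    declare namespace qry = "http://marklogic.com/cts/query";

    (: Return only the trace messages that report an unsearchable step. :)
    for $trace in xdmp:plan(//image/id[. = "1"]/..)//qry:info-trace
    where fn:contains($trace, "unsearchable")
    return fn:string($trace)
    ```

    Note that the filter has to test for "unsearchable" rather than "searchable", because the latter is a substring of the former and would match both kinds of message.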


    If someone finds more information on how to interpret and work with xdmp:plan output, I'd be happy to know about it.

    Update 17.11.2018:

    Found this really interesting video where Mary Holstege talks about MarkLogic Search and Indexes. It covers a whole lot of my questions and I can really recommend it.