Trouble understanding CTS queries using XQuery in Marklogic


I am trying to understand the difference between cts:element-query, cts:element-value-query and cts:element-word-query when used with cts:search().

If the same thing can be achieved with all three, why were so many created?

I am sure I am missing something here. I have the following data:

<CATALOG>
  <CD>
    <TITLE>Empire Burlesque</TITLE>
    <ARTIST>Bob Dylan</ARTIST>
    <COUNTRY>USA</COUNTRY>
    <COMPANY>Columbia</COMPANY>
    <PRICE>10.90</PRICE>
    <YEAR>1985</YEAR>
  </CD>
  <CD>
    <TITLE>Hide your heart</TITLE>
    <ARTIST>Bonnie Tyler</ARTIST>
    <COUNTRY>USA</COUNTRY>
    <COMPANY>CBS Records</COMPANY>
    <PRICE>9.90</PRICE>
    <YEAR>1988</YEAR>
  </CD>
  <CD>
    <TITLE>Greatest Hits</TITLE>
    <ARTIST>Dolly Parton</ARTIST>
    <COUNTRY>EU</COUNTRY>
    <COMPANY>RCA</COMPANY>
    <PRICE>9.90</PRICE>
    <YEAR>1982</YEAR>
  </CD>
</CATALOG>

I want to filter the data by country, say "EU". I can get the same result with any of the queries listed below.

  • cts:search(//CD,cts:element-query(xs:QName("COUNTRY"),"EU"))
    
  • cts:search(//CD,cts:element-value-query(xs:QName("COUNTRY"),"EU"))    
    
  • cts:search(//CD,cts:element-word-query(xs:QName("COUNTRY"),"EU"))
    

So what is the difference? When to use what? Can someone help me understand?

My understanding was that cts:search should be used with cts:element-query. While researching, I found that the other queries return the same thing. (I have gone through the documentation and still don't understand.) Can someone please give me a simple explanation?


Solution

  • Those three cts:element-* query functions have some overlapping functionality, and it is possible to get the same results, but there are some key differences that affect what is possible and how efficient the query may be for your system.

    • cts:element-query() is a container query. It matches the element specified in the first parameter, and the query from the second parameter is applied to the matched element and all of its descendants. Here the string "EU" is treated as a cts:word-query, so it would match the text of COUNTRY or of any descendant elements, if there were a more complex structure.

      Using xdmp:plan() to see the query plan,

      xdmp:plan(cts:search(//CD,cts:element-query(xs:QName("COUNTRY"),"EU")))
      

      you can see the plan has criteria with an unconstrained word-query being applied:

      <qry:term-query weight="1">
        <qry:key>17785254954065741518</qry:key>
        <qry:annotation>word("EU")</qry:annotation>
      </qry:term-query>
      
    • cts:element-value-query() only matches against simple elements (that is, elements that contain only text and have no element children) with text content matching the phrase from the second parameter.

      The xdmp:plan() for that query:

      xdmp:plan( cts:search(//CD,cts:element-value-query(xs:QName("COUNTRY"),"EU")) )
      

      reveals that a value query is being applied specifically to the COUNTRY element:

      <qry:term-query weight="1">
        <qry:key>9358511946618902997</qry:key>
        <qry:annotation>element(COUNTRY,value("EU"))</qry:annotation>
      </qry:term-query>
      
    • cts:element-word-query() is similar to cts:element-value-query(), except that it matches a word or phrase contained in the element's text rather than requiring the entire text content to match. It searches only through immediate text node children of the specified element, as well as any text node children of child elements defined in the Admin Interface as element-word-query-throughs or phrase-throughs. It does not search through any other children of the specified element.

      The xdmp:plan() for that query:

      xdmp:plan( cts:search(//CD,cts:element-word-query(xs:QName("COUNTRY"),"EU")) )
      

      shows that there is a word query applied specifically to the COUNTRY element:

      <qry:term-query weight="1">
        <qry:key>6958980695756965065</qry:key>
        <qry:annotation>element(COUNTRY,word("EU"))</qry:annotation>
      </qry:term-query>
      

      cts:element-word-query is most helpful if you have mixed content and a known vocabulary of specific elements that you want to be able to "see through" when searching. One example is MS Word or XHTML markup, in which elements such as <b>, <i>, and <u> wrap text inside a <p> to apply styling and formatting. To search for a word in a given paragraph, you would want the query to search through the b, i, and u child elements.
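
    The "EU" example hides the difference between the word and value queries, because "EU" is both a word in COUNTRY and its entire value. A minimal sketch of the contrast, assuming the CATALOG document from the question is loaded and using ARTIST (whose text has two words) as the illustrative element:

    ```xquery
    xquery version "1.0-ml";

    (: Assumption: the CATALOG document from the question is in the database. :)

    (: Word query: "Bob" is one word inside "Bob Dylan",
       so the Empire Burlesque CD should match. :)
    let $by-word := cts:search(//CD,
      cts:element-word-query(xs:QName("ARTIST"), "Bob"))

    (: Value query: the full text of ARTIST is "Bob Dylan", not "Bob",
       so no CD should match. :)
    let $by-value := cts:search(//CD,
      cts:element-value-query(xs:QName("ARTIST"), "Bob"))

    (: Returns the two result counts for comparison. :)
    return (fn:count($by-word), fn:count($by-value))
    ```

    Passing "Bob Dylan" to the value query instead should match again, since that is the complete text of the element.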

    For this specific instance, looking to search for a value in a specific element, you should use:

    cts:search(//CD,cts:element-value-query(xs:QName("COUNTRY"),"EU")) 
    

    It is the most specific and efficient means of telling MarkLogic that you want to search for the value "EU" in the COUNTRY element (and not in any of its children or descendants).
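
    As for when cts:element-query is the right tool: it is useful as a container when several conditions must hold inside the same element. The following is a sketch only, assuming all three CD elements live in one document and that the search is filtered (the default). The COUNTRY and YEAR values are chosen for illustration, so that they occur in the document but never in the same CD.

    ```xquery
    xquery version "1.0-ml";

    (: Assumption: the whole CATALOG is stored as a single document. :)

    let $conditions := cts:and-query((
      cts:element-value-query(xs:QName("COUNTRY"), "EU"),
      cts:element-value-query(xs:QName("YEAR"), "1985")
    ))

    (: Without a container, the conditions may be satisfied by different
       CD elements, so the catalog document should match. :)
    let $anywhere := cts:search(fn:doc(), $conditions)

    (: With cts:element-query as a container, both conditions must hold
       inside one CD element; no CD is both "EU" and 1985, so this
       should return nothing. :)
    let $same-cd := cts:search(fn:doc(),
      cts:element-query(xs:QName("CD"), $conditions))

    (: Returns the two result counts for comparison. :)
    return (fn:count($anywhere), fn:count($same-cd))
    ```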