Tags: marklogic, marklogic-8, marklogic-9

Search for an XPath and its value in XML documents and get the document URIs from MarkLogic


I have XML stored in MarkLogic, as shown below:

<testData>
  <datatypes>
    <datatypename>datatypename1</datatypename>
    <datatype>datatype1</datatype>
  </datatypes>
  <datavalue>
    <code>code1</code>
    <value>value1</value>
  </datavalue>
  <datavalue>
    <code>code2</code>
    <value>value2</value>
  </datavalue>
  <datavalue>
    <code>code3</code>
    <value>value3</value>
  </datavalue>
</testData>

The same values and document structure may also appear in other XML documents with different URIs.

So my requirement is this: I need the URIs of all documents in MarkLogic that contain the exact XPath /testData/datatypes/datatypename with the exact value datatypename1.


Solution

  • The most straightforward approach is to create a path range index on /testData/datatypes/datatypename and use a path range query:

    cts:path-range-query("/testData/datatypes/datatypename", "=", "datatypename1")
    
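    Since the question asks for URIs rather than documents, the query can be passed to cts:uris. This is a minimal sketch, assuming the path range index already exists as a string index, the URI lexicon is enabled on the database, and the index uses the default collation (otherwise a "collation=..." option would be needed on the query):

    ```xquery
    xquery version "1.0-ml";

    (: Returns the URI of every document whose /testData/datatypes/datatypename
       equals "datatypename1". Resolved from the indexes; no documents are read. :)
    cts:uris(
      (),    (: no start value: begin at the first URI :)
      (),    (: no options :)
      cts:path-range-query("/testData/datatypes/datatypename", "=", "datatypename1")
    )
    ```
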

    An alternative is to use a value query within scoping element queries:

    cts:element-query(xs:QName("testData"), 
      cts:element-query(xs:QName("datatypes"), 
        cts:element-value-query(xs:QName("datatypename"), "datatypename1"))) 
    

    This approach is more susceptible to false positives. For small candidate result sets, such false positives can be mitigated by filtering.
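
    Filtering could be sketched as follows. This assumes a filtered cts:search plus an explicit XPath check in a where clause to confirm the exact child path; xdmp:node-uri then yields the URI of each document that survives:

    ```xquery
    xquery version "1.0-ml";

    let $query :=
      cts:element-query(xs:QName("testData"),
        cts:element-query(xs:QName("datatypes"),
          cts:element-value-query(xs:QName("datatypename"), "datatypename1")))
    (: "filtered" makes the server re-check each candidate document against the query :)
    for $doc in cts:search(fn:doc(), $query, "filtered")
    (: extra guard: keep only documents where the exact path holds the value :)
    where $doc/testData/datatypes/datatypename = "datatypename1"
    return xdmp:node-uri($doc)
    ```

    Each candidate document is read in order to be checked, which is why this only suits small candidate result sets.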

    For large candidate result sets, positional false positives (such as a document that has testData/datatypes and datatypes/datatypename but not testData/datatypes/datatypename) can be eliminated by indexing element positions. When punctuation is distinctive for the match, tokenization of the value will also produce false positives. In such cases, path range indexes are the correct solution for large candidate result sets.
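
    Because a path range query depends on the index being configured, here is a minimal sketch of creating it with the Admin API. The database name "Documents", the root collation, and the "reject" setting are assumptions for illustration only; the same index can also be added through the Admin interface.

    ```xquery
    xquery version "1.0-ml";
    import module namespace admin = "http://marklogic.com/xdmp/admin"
      at "/MarkLogic/admin.xqy";

    let $config := admin:get-configuration()
    let $db-id  := xdmp:database("Documents")   (: assumed database name :)
    let $index  := admin:database-range-path-index(
      $db-id,
      "string",                                 (: scalar type of the indexed value :)
      "/testData/datatypes/datatypename",       (: path from the question :)
      "http://marklogic.com/collation/",        (: assumed: root collation :)
      fn:false(),                               (: do not index range value positions :)
      "reject")                                 (: assumed: reject values that cannot be cast :)
    return admin:save-configuration(
      admin:database-add-range-path-index($config, $db-id, $index))
    ```

    Existing documents have to be reindexed before the new index returns complete results.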