marklogic marklogic-8

Return only URIs from cts:reverse-query


UPDATED: see the end for an example showing that cts:uris is not a valid approach, as it does not return the right results in all cases.

I have a use case in MarkLogic where a search containing a cts:reverse-query will sometimes return hundreds of thousands of matches. All I want back is the URIs of the matching documents, so that I can cache them and process them later via Corb2.

Example Code:

xquery version "1.0-ml";


let $_ := xdmp:invoke-function(function(){
  for $val in ("foo", "bar", "baz")
    let $query := <query>{cts:element-word-query(xs:QName("what"), ($val))}</query>
    return (
              xdmp:document-insert("/test/reverse-" || $val ||".xml", $query, (), ("test-reverse")),
              xdmp:commit()
           )

},<options xmlns="xdmp:eval">
        <transaction-mode>update</transaction-mode>
      </options>
)


return for $result in cts:search(collection("test-reverse"), cts:reverse-query(<what>baz</what>))
  return xdmp:node-uri($result)

And this returns:

/test/reverse-baz.xml

Which is expected.

However, I feel like I am doing too much processing here, since cts:search() already hands me a document. Then again, since MarkLogic is lazy, maybe I really only hold a reference at this point, because I have not accessed anything in the doc?

What I would like is to use cts:uris() to get the same result as above. However, you cannot use cts:reverse-query with cts:uris().

Yes, I understand that a reverse query does not necessarily need to be stored in the database as a document to be used (see the cts:contains example), so in some use cases URIs don't even exist. But for me they do.

Also, I'm pretty sure I can use xdmp:plan() to generate the query needed for cts:uris() by pulling out the final-plan and rewriting the namespace to cts, but I'm not sure yet whether this is any faster.
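
A minimal sketch of the xdmp:plan() idea, assuming the returned plan contains a final-plan element (matched with a wildcard prefix so that no namespace is assumed). It only surfaces the plan for inspection; it does not attempt the namespace rewrite and says nothing about speed.

```xquery
xquery version "1.0-ml";

(: Ask the server how it would resolve the reverse-query search,
   without actually returning the matching documents. :)
let $plan := xdmp:plan(
  cts:search(collection("test-reverse"), cts:reverse-query(<what>baz</what>))
)

(: Assumption: the plan has a final-plan descendant; *: avoids hard-coding its namespace. :)
return $plan//*:final-plan
```

Looking at the whole of $plan rather than just the final-plan can also help, since it tends to show which index lookups were used.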

Given the above, I have two questions:

  1. For someone who understands the internals of MarkLogic: would you consider cts:search -> for loop -> xdmp:node-uri() to be efficient and low-impact? In other words, do I accomplish this without expanding the document into the expanded tree cache?
  2. If not, can you think of a more efficient way to mimic cts:uris than what I did above?

Why not cts:uris()? Because when there are NO matches, I get back everything:
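
One way to probe question 1 empirically is to read the request's cache counters after the search has run. This is a sketch only, assuming xdmp:query-meters() reports expanded-tree-cache-hits and expanded-tree-cache-misses elements for the current request (matched with a wildcard prefix); xdmp:eager is used on the assumption that it forces the search to run before the counters are read.

```xquery
xquery version "1.0-ml";

(: Force the search and URI extraction to run now rather than lazily. :)
let $uris := xdmp:eager(
  for $result in cts:search(collection("test-reverse"), cts:reverse-query(<what>baz</what>))
  return xdmp:node-uri($result)
)

(: Snapshot the resource counters for this request. :)
let $meters := xdmp:query-meters()

return (
  $uris,
  (: Assumption: these counter elements are present in the meters output. :)
  $meters//*:expanded-tree-cache-hits,
  $meters//*:expanded-tree-cache-misses
)
```

Comparing the counters against a run that does touch document content (for example, returning $result itself) would indicate how much tree expansion the URI-only loop actually triggers.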

xquery version "1.0-ml";


let $_ := xdmp:invoke-function(function(){
  for $val in ("foo", "bar", "baz")
    let $query := <query>{cts:element-word-query(xs:QName("what"), ($val))}</query>
    return (
              xdmp:document-insert("/test/reverse-" || $val ||".xml", $query, (), ("test-reverse")),
              xdmp:commit()
           )

},<options xmlns="xdmp:eval">
        <transaction-mode>update</transaction-mode>
      </options>
)


return cts:uris((),(),
  cts:and-query((
    cts:collection-query("test-reverse"),
    cts:reverse-query((<foo/>))
  )))

Returns:
/test/reverse-bar.xml
/test/reverse-baz.xml
/test/reverse-foo.xml

And the final sample showing the difference in results (validating that cts:uris does not work):

xquery version "1.0-ml";


let $_ := xdmp:invoke-function(function(){
  for $val in ("foo", "bar", "baz")
    let $query := <query>{cts:element-word-query(xs:QName("what"), ($val))}</query>
    return (
              xdmp:document-insert("/test/reverse-" || $val ||".xml", $query, (), ("test-reverse")),
              xdmp:commit()
           )

},<options xmlns="xdmp:eval">
        <transaction-mode>update</transaction-mode>
      </options>
)

let $uris-from-cts-uris := cts:uris((),(),
  cts:and-query((
    cts:collection-query("test-reverse"),
    cts:reverse-query(<foo/>)
  )))

let $uris-from-search := for $result in cts:search(collection("test-reverse"), cts:reverse-query(<foo/>))
  return xdmp:node-uri($result)

return 
(
  $uris-from-cts-uris,
  "xxxxxxxxxxxxxxxxx",
  $uris-from-search

) 

Which results in this:

/test/reverse-bar.xml
/test/reverse-baz.xml
/test/reverse-foo.xml
xxxxxxxxxxxxxxxxx

In other words: cts:uris gave me the whole collection when it should have given me nothing, while cts:search gave the right response (no results).
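
One way to narrow down the discrepancy: cts:uris resolves its query from the indexes without filtering, whereas cts:search filters results by default. The sketch below runs the same search with the "unfiltered" option against the test documents above. If it also returns the whole collection, the difference comes from index resolution of the reverse query rather than from cts:uris itself.

```xquery
xquery version "1.0-ml";

(: Same reverse query as above, but skipping the filtering pass
   so that only index resolution decides what matches. :)
for $result in cts:search(
  collection("test-reverse"),
  cts:reverse-query(<foo/>),
  "unfiltered"
)
return xdmp:node-uri($result)
```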


Solution

  • Why can't you use cts:reverse-query() with cts:uris()?

    Try this:

    cts:uris((),(),
      cts:and-query((
        cts:collection-query("test-reverse"),
        cts:reverse-query(<what>baz</what>)
      )))