Extracting data from a Table (MarkLogic)


I am trying to extract data from a table to use as a list of URIs for an XQuery CTS search. So far, I have written an Optic query that returns data in a table-like format.

For example, using the following Optic query

import module namespace op="http://marklogic.com/optic" at "/MarkLogic/optic.xqy";
import module namespace ofn="http://marklogic.com/optic/expression/fn" at "/MarkLogic/optic/optic-fn.xqy";


let $people := op:from-sparql('SELECT * WHERE {?person </id> "000". ?person </path> ?animal}', "sparql")
                => op:select(( op:as('personStr',ofn:string(op:col('person'))), op:as('animalStr',ofn:string(op:col('animal'))) ))


return(
$people => op:result()
)

I will retrieve a table that looks like

personStr      |    animalStr
-------------------------------
/people/000         /animal/001
/people/000         /animal/002

The table contains URIs pointing to various documents. I hope to extract one column of URIs (animalStr, for instance) and filter those documents using cts:search(fn:doc(uri_list), ....)

===Update with Current Approach===

let $people := op:from-sparql('SELECT * WHERE {?person </id> "000". ?person </path> ?animal}', "sparql")
                => op:select(( op:as('personStr',ofn:string(op:col('person'))), op:as('animalStr',ofn:string(op:col('animal'))) ))

let $animal := op:from-lexicons(
  map:entry("animal",cts:uri-reference()),
  "lexicon")
  =>op:where(
     cts:path-geospatial-query("animal_data/location",
    cts:circle(5500, cts:point(-55.854526273011, -151.93342455372309)),
    "type=long-lat-point")
  )

return(
$animal  => op:join-inner(
    $people,
    op:on(
      "animal","animalStr"
    )
  )
  => op:select(("personStr", "animalStr"))
  => op:result()
)

With this approach, I always have to retrieve all animals within a certain location before performing an inner join to get my results. Ideally, I would like to apply the geospatial query directly to the results retrieved from the SPARQL query.


Solution

  • Optic can constrain a SPARQL query to the triples projected from documents that match a dynamic query, including documents matched by geospatial cts queries.

    The general pattern is:

    op.fromSparql(...SPARQL query...)
      .where(...cts query...)
    

    While the builder specifies the cts.query after the SPARQL query, the engine doesn't execute those steps sequentially. Instead, the engine ignores all triples from documents that don't match the cts.query.

    A constraining cts.query works the same way for rows and lexicon values.

    In fact, supplying a constraining cts.query with a where() clause is a best practice for improving the efficiency of retrieval. That's especially true for triples because triples are by definition join intensive.
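
    Applied to the query in the question, the pattern might look like the sketch below. It reuses the SPARQL string, path, and circle from the question unchanged. One assumption to check: the cts query constrains the documents that the triples come from, so this only narrows the result if the matching triples are stored in the same documents that carry animal_data/location. The last two lines are also an assumption about the intended use: they pull the animalStr column out of the result rows and pass it to cts:search through cts:document-query.

    ```xquery
    xquery version "1.0-ml";
    import module namespace op="http://marklogic.com/optic" at "/MarkLogic/optic.xqy";
    import module namespace ofn="http://marklogic.com/optic/expression/fn" at "/MarkLogic/optic/optic-fn.xqy";

    (: Geospatial constraint copied from the question :)
    let $near := cts:path-geospatial-query("animal_data/location",
        cts:circle(5500, cts:point(-55.854526273011, -151.93342455372309)),
        "type=long-lat-point")

    (: where() with a cts:query restricts the source documents of the triples :)
    let $rows := op:from-sparql('SELECT * WHERE {?person </id> "000". ?person </path> ?animal}', "sparql")
        => op:where($near)
        => op:select((
             op:as('personStr', ofn:string(op:col('person'))),
             op:as('animalStr', ofn:string(op:col('animal')))
           ))
        => op:result()

    (: Each row is a map; collect one column as a sequence of URI strings :)
    let $uris := $rows ! map:get(., "animalStr")

    (: Assumed follow-up: fetch only the documents named in that column :)
    return cts:search(fn:doc(), cts:document-query($uris))
    ```

    If the triples and the location data live in different documents, the where() clause cannot see the location, and the join from the question remains the way to combine the two sources.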

    Is it possible that this pattern can meet the requirement?

    Hoping that helps,