Is there a way to get the node that contains a search snippet? For example,
I have a sample XML document:
<pdf2xml>
<page pageNo="1">xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx</page>
<page pageNo="2">xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx</page>
<page pageNo="3">xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx</page>
<page pageNo="4">xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx</page>
</pdf2xml>
How do I get the pageNo for a given search result? I tried the following:
search:snippet(fn:doc($uri),
cts:query(search:parse($q, $options)),
<transform-results apply="snippet" xmlns="http://marklogic.com/appservices/search">
<per-match-tokens>30</per-match-tokens>
<max-matches>1000</max-matches>
<max-snippet-chars>2000</max-snippet-chars>
<preferred-matches>
<element name="page" ns=""/>
</preferred-matches>
</transform-results>)
This does not return all of the snippets either. What is a good way to do what I want?
Finding every match in a document, returning its containing element, and highlighting the match can all be done with cts:walk and cts:highlight:
xquery version "1.0-ml";
let $content := <pdf2xml>
<page pageNo="1">xxxxxxxxxxxxxx 1 xxxxxxxxx</page>
<page pageNo="2">xxxxxxxxxxxxxx 2 xxxxx foo xxxxxxxx</page>
<page pageNo="3">xxxxxxxxxxxxxxx 3 xxxxxxxxxxxxxxxxxxxxxxx</page>
<page pageNo="4">xxxxxxxxxxxxxxxxx 4 xxxxxxxxxxx foo xxxxxxxxxx</page>
</pdf2xml>
let $q := cts:word-query("foo")
(: cts:walk evaluates the expression once per match; $cts:node is the matching text node, so its parent is the containing page element :)
return <results>
  {cts:walk($content, $q,
    <result>
      <original-node>{$cts:node/parent::*}</original-node>
      <highlighted-content>{cts:highlight($cts:node/parent::*, $q, <matched>{$cts:text}</matched>)}</highlighted-content>
    </result>
  )}
</results>
Results in:
<results>
<result>
<original-node>
<page pageNo="2">xxxxxxxxxxxxxx 2 xxxxx foo xxxxxxxx</page>
</original-node>
<highlighted-content>
<page pageNo="2">xxxxxxxxxxxxxx 2 xxxxx <matched>foo</matched> xxxxxxxx</page>
</highlighted-content>
</result>
<result>
<original-node>
<page pageNo="4">xxxxxxxxxxxxxxxxx 4 xxxxxxxxxxx foo xxxxxxxxxx</page>
</original-node>
<highlighted-content>
<page pageNo="4">xxxxxxxxxxxxxxxxx 4 xxxxxxxxxxx <matched>foo</matched> xxxxxxxxxx</page>
</highlighted-content>
</result>
</results>
This may not be exactly what you want, but I offer it as an example of the power available to you for manipulating your results (in this example, extracting and highlighting content, whether or not it comes from a search).
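Since the question asks specifically for the pageNo, the same kind of walk can return just that attribute instead of the whole element. This is a minimal sketch rather than a definitive solution. It assumes every match sits somewhere inside a page element, as in the sample, and it uses ancestor::page[1] instead of parent::* so that it would still work if the page text contained nested inline elements. fn:distinct-values is there only to collapse pages that match more than once.

```xquery
xquery version "1.0-ml";
(: Same sample content as above; in practice this would be fn:doc($uri) :)
let $content := <pdf2xml>
<page pageNo="1">xxxxxxxxxxxxxx 1 xxxxxxxxx</page>
<page pageNo="2">xxxxxxxxxxxxxx 2 xxxxx foo xxxxxxxx</page>
<page pageNo="3">xxxxxxxxxxxxxxx 3 xxxxxxxxxxxxxxxxxxxxxxx</page>
<page pageNo="4">xxxxxxxxxxxxxxxxx 4 xxxxxxxxxxx foo xxxxxxxxxx</page>
</pdf2xml>
let $q := cts:word-query("foo")
(: For each match, emit the pageNo of the nearest enclosing page element :)
return fn:distinct-values(
  cts:walk($content, $q, fn:string($cts:node/ancestor::page[1]/@pageNo))
)
```

For the sample content this should return the page numbers of the two pages that contain "foo". If you need one entry per match rather than one per page, drop the fn:distinct-values call.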