
Get the nodes containing the search snippet in a document

Is there a way to get the node that contains the search snippet? For example:

I have a sample XML document:

  <page pageNo="1">xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx</page>
  <page pageNo="2">xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx</page>
  <page pageNo="3">xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx</page>
  <page pageNo="4">xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx</page>

How do I get the pageNo for a given search result? I tried the following:

  cts:query(search:parse($q, $options)),  
    <transform-results apply="snippet" xmlns="">
        <element name="page" ns=""/>

This does not return all of the snippets either. What is a good way of doing what I want to do?


  • To find every match in a document, return the element that contains it, and highlight the matched text, you can use cts:walk together with cts:highlight:

    xquery version "1.0-ml";
    let $content := <pdf2xml>
      <page pageNo="1">xxxxxxxxxxxxxx 1 xxxxxxxxx</page>
      <page pageNo="2">xxxxxxxxxxxxxx 2 xxxxx foo xxxxxxxx</page>
      <page pageNo="3">xxxxxxxxxxxxxxx 3 xxxxxxxxxxxxxxxxxxxxxxx</page>
      <page pageNo="4">xxxxxxxxxxxxxxxxx 4 xxxxxxxxxxx foo xxxxxxxxxx</page>
    </pdf2xml>
    let $q := cts:word-query("foo")
    (: cts:walk evaluates its third argument once per match; $cts:node is the matched text node :)
    return <results>
    {cts:walk($content, $q,
      (: for each match, emit the containing element, then a highlighted copy of it :)
      (
        $cts:node/parent::*,
        <highlighted-content>{cts:highlight($cts:node/parent::*, $q, <matched>{$cts:text}</matched>)}</highlighted-content>
      )
    )}
    </results>

    Results in (wrapper elements omitted):

          <page pageNo="2">xxxxxxxxxxxxxx 2 xxxxx foo xxxxxxxx</page>
          <page pageNo="2">xxxxxxxxxxxxxx 2 xxxxx <matched>foo</matched> xxxxxxxx</page>
          <page pageNo="4">xxxxxxxxxxxxxxxxx 4 xxxxxxxxxxx foo xxxxxxxxxx</page>
          <page pageNo="4">xxxxxxxxxxxxxxxxx 4 xxxxxxxxxxx <matched>foo</matched> xxxxxxxxxx</page>

    This may not be exactly what you want, but I offer it as an example of the power available for manipulating your results: here, extracting and highlighting content, whether or not it came from a search.
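
    If only the page numbers are needed, the same cts:walk approach can return the pageNo attribute of each match's containing page rather than the element itself. The following is a minimal sketch, assuming the same in-memory sample document as above; the variable name $page-numbers is purely illustrative, and for documents stored in a database you would pass the document node instead of an inline element.

    xquery version "1.0-ml";
    (: Sketch only: the inline sample data and variable names are assumptions for illustration :)
    let $content := <pdf2xml>
      <page pageNo="1">xxxxxxxxxxxxxx 1 xxxxxxxxx</page>
      <page pageNo="2">xxxxxxxxxxxxxx 2 xxxxx foo xxxxxxxx</page>
      <page pageNo="3">xxxxxxxxxxxxxxx 3 xxxxxxxxxxxxxxxxxxxxxxx</page>
      <page pageNo="4">xxxxxxxxxxxxxxxxx 4 xxxxxxxxxxx foo xxxxxxxxxx</page>
    </pdf2xml>
    let $q := cts:word-query("foo")
    (: For each match, climb from the matched text node to its nearest page ancestor and read pageNo :)
    let $page-numbers :=
      cts:walk($content, $q, $cts:node/ancestor::page[1]/@pageNo/fn:string(.))
    (: A page with several matches is visited several times, so de-duplicate :)
    return fn:distinct-values($page-numbers)

    With the sample data this should yield the page numbers 2 and 4. Using ancestor::page[1] instead of parent::* keeps the lookup working if the matched text sits inside a nested element within a page.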