
Using functx:index-of-match-first in XQuery to return a substring of a text node


I am trying to write an XQuery that finds all text nodes containing a given keyword in an XML file. The text nodes are long, so I would like to return a substring (of a desired length) of the text, starting at the matching keyword.

Samplefile.xml

<books>
<book>
  <title>linear systems</title>
  <content>vector spaces and linear system analysis </content>
</book>
<book>
  <title>some title</title>
  <content>some content</content>
</book>
</books>

samplexquery.xq

declare namespace functx = "http://www.functx.com";

for $match_result in /*/book/*[contains(.,'linear')]/text()
  return substring($match_result, functx:index-of-match-first($match_result,'linear'), 50)

I expect to get the result [linear systems, linear system analysis]. The title node of the first book contains the word 'linear', so the query should return up to 50 characters starting from 'linear'. The same applies to the content node of the first book.

I am using XQuery 1.0 and I included the namespace functx as shown in the example at: http://www.xqueryfunctions.com/xq/functx_index-of-match-first.html

But, this is giving me an error: [XPST0017] Unknown function "functx:index-of-match-first(...)".

Thanks, Sony


Solution

  • I am using XQuery 1.0 and I included the namespace functx as shown in the example at: http://www.xqueryfunctions.com/xq/functx_index-of-match-first.html

    But, this is giving me an error: [XPST0017] Unknown function "functx:index-of-match-first(...)".

    Declaring the namespace alone is not sufficient: the declaration only binds the prefix functx to a URI.

    The code of the function must also be available to the query, because only the standard XQuery and XPath functions and operators are predefined in the language.

    This corrected code:

    declare namespace functx = "http://www.functx.com";

    (: Returns the 1-based position of the first match of $pattern in $arg,
       or the empty sequence if there is no match. :)
    declare function functx:index-of-match-first
      ( $arg as xs:string?,
        $pattern as xs:string ) as xs:integer? {

      if (matches($arg, $pattern))
      then string-length(tokenize($arg, $pattern)[1]) + 1
      else ()
    };

    for $match_result in /*/book/*[contains(., 'linear')]/text()
    return substring($match_result, functx:index-of-match-first($match_result, 'linear'), 50)
    

    when applied to the provided XML document (with several non-well-formedness errors corrected):

    <books>
      <book>
        <title>linear systems</title>
        <content>vector spaces and linear system analysis </content>
      </book>
      <book>
        <title>some title</title>
        <content>some content</content>
      </book>
    </books>
    

    produces the expected result:

    linear systems linear system analysis
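
    Because only the standard functions are predefined, the same substring can also be obtained without FunctX when the keyword is a literal string rather than a regular expression. The following is a minimal sketch under that assumption; the variable names $keyword and $text are illustrative only. It rebuilds the text from the keyword onward with substring-after and then truncates it to 50 characters:

    ```xquery
    (: Sketch using only built-in functions; assumes $keyword is a literal
       string, not a regular expression. :)
    let $keyword := 'linear'
    for $text in /*/book/*[contains(., $keyword)]/text()
    (: substring-after drops everything up to and including the first
       occurrence of the keyword, so the keyword is prepended again. :)
    return substring(concat($keyword, substring-after($text, $keyword)), 1, 50)
    ```

    Unlike functx:index-of-match-first, which treats its second argument as a regular expression, this variant matches the keyword literally.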
    

    It is good practice to use an import module declaration to import existing function libraries, rather than copying individual functions into the query.
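
    As a rough sketch, such an import could look like the following. It assumes the FunctX library module has been downloaded and saved next to the query; the file name functx.xq is hypothetical, and how the location hint after "at" is resolved depends on the XQuery processor:

    ```xquery
    (: Assumption: the FunctX library module is stored as functx.xq in the
       same directory as this query; file name and location are hypothetical. :)
    import module namespace functx = "http://www.functx.com" at "functx.xq";

    for $match_result in /*/book/*[contains(., 'linear')]/text()
    return substring($match_result, functx:index-of-match-first($match_result, 'linear'), 50)
    ```

    The import binds the functx prefix itself, so a separate declare namespace line is not needed in this form.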