Pattern or Format Match in XQuery MarkLogic


I need to check whether a given string is in the format *(*), where * must not contain spaces, so there cannot be two words before the (.

I am searching a MarkLogic DB to see whether a given column value matches the pattern [^\s]+\((?!\s)[^()]+(?<!\s)\); if it does not, I want to replace it with a value in this format.

I am still stuck at fetching the data and have not been able to write the update query.

I am searching the DB as follows:

    let $query-opts := cts:search(doc(),
      cts:and-query((
        cts:directory-query(("/xyz/documentData/"), "1"),
        cts:element-query(
          xs:QName("cd:clause"),  (: <clause> element inside extended for checking query id :)
          cts:and-query((
            cts:element-attribute-value-query(xs:QName("cd:clause"), xs:QName("tag"), "Title"),  (: only if the <clause> is of type "Title" :)
            cts:element-attribute-value-query(xs:QName("cd:xmetadata"), xs:QName("tag"), "Author")
          ))
        )
      ))
    )
    for $d in $query-opts
    return (
      for $x in $d//cd:document/cd:clause/cd:xmetadata[fn:matches(@tag, "Author")]/cd:metadata_string
      where fn:matches($x/string(), "[^\s]+\((?!\s)[^()]+(?<!\s)\)")
      return
        <documents>{
          <documentId>{$d//cd:cdf/cd:documentId/string()}</documentId>
        }</documents>
    )

It throws an "invalid pattern" error.


Solution

  • The fn:matches function does not support lookaround assertions such as (?! (negative lookahead) and (?<! (negative lookbehind). Simplify your pattern and, if necessary, filter out false positives after the match with a second match.

    Making an educated guess at what you are trying to do, I think you are looking for something like:

    where fn:matches($x, '^.+\([^)]+\).*$') (: it uses parentheses :)
      and fn:not(fn:matches($x, '^[^\s]+\([^\s)]+\)$')) (: but does not comply with the strict rules :)
    

    HTH!
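
    If whitespace should be rejected only directly after ( and directly before ), as the lookarounds in the question suggest, the same check can probably be written without them. The following is a minimal sketch, not a tested solution: local:is-strict is a hypothetical helper name, the sample strings are made up, and the ^ and $ anchors are an assumption added so that a second word before the ( is rejected.

    ```xquery
    xquery version "1.0-ml";

    (: Hypothetical helper: true when $s looks like word(text), with no
       whitespace before "(" and none directly after "(" or directly
       before ")". Whitespace in the middle of the parenthesised text
       is still allowed, as in the pattern from the question. :)
    declare function local:is-strict($s as xs:string) as xs:boolean {
      fn:matches($s, '^[^\s]+\([^\s()]([^()]*[^\s()])?\)$')
    };

    (: Made-up sample values, for illustration only :)
    for $s in ("Smith(John)", "Smith(John Doe)", "John Smith(Author)", "Smith( John )")
    return <check value="{$s}" strict="{local:is-strict($s)}"/>
    ```

    Under these assumptions the first two samples should come out as strict and the last two as not strict. The sub-pattern [^\s()]([^()]*[^\s()])? stands in for (?!\s)[^()]+(?<!\s): it requires the first and last characters inside the parentheses to be non-whitespace and leaves the characters in between unrestricted.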