The purpose of my structured query is to match all documents that have a JSON property whose value contains a specific substring. So I wrote the following code, which builds this query with the MarkLogic Java API:
var jsonProperty = queryBuilder.jsonProperty("xyz");
String[] wordOptions = {"case-insensitive", "wildcarded"};
return queryBuilder.word(jsonProperty, null, wordOptions, 0, "*m-Em i*");
For some reason there are more search matches than expected. For example, a document whose "xyz" JSON property contains "PM-EM 926-2:2020" is matched, but it shouldn't be. What might be the reason for this?
I have also tried:
cts:search(fn:doc(), cts:json-property-word-query("xyz", "*m-Em I*", ("case-insensitive", "wildcarded")))
and it returns the expected matches, but I would rather stick to a structured query.
Do you get the same results if you add the "unfiltered" option to your cts:search()?
"*m-Em I*" is not a word. It is a phrase that contains a - punctuation character and a leading wildcard, and I* is a one-character word with a trailing wildcard.
So, unless you have the necessary backing indexes, you are likely just searching for "Em", and because cts:search filters by default, it then discards the false positives and gives you more accurate results.
Take a look at the plan and see what your search winds up becoming:
xdmp:plan(cts:search(fn:doc(), cts:json-property-word-query("xyz", "*m-Em I*", ("case-insensitive", "wildcarded"))))
And take a look at the difference in results when applying "unfiltered" to the cts:search, or by wrapping the search with xdmp:estimate() to see what the unfiltered, index-resolved results would be before applying filtering.
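To picture the filtered versus unfiltered difference, here is a toy model in plain Java. It is only a sketch and not how MarkLogic is implemented: it assumes the index resolves the pattern to just the word "em" (as guessed above), and the class and method names (FilterSketch, unfiltered, filtered, toRegex) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

public class FilterSketch {

    // Stand-in for index resolution (ASSUMPTION: only the word "em" is looked up).
    // Any value that contains "em" as a whole token is a candidate.
    static List<String> unfiltered(List<String> values) {
        List<String> out = new ArrayList<>();
        for (String v : values) {
            for (String token : v.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
                if (token.equals("em")) {
                    out.add(v);
                    break;
                }
            }
        }
        return out;
    }

    // Stand-in for filtering: re-check every candidate against the full pattern.
    static List<String> filtered(List<String> values, String wildcard) {
        Pattern p = toRegex(wildcard);
        List<String> out = new ArrayList<>();
        for (String v : unfiltered(values)) {
            if (p.matcher(v).find()) {
                out.add(v);
            }
        }
        return out;
    }

    // Translate '*' to "zero or more non-space characters" and '?' to "one
    // non-space character"; everything else is matched literally, ignoring case.
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                sb.append("\\S*");
            } else if (c == '?') {
                sb.append("\\S");
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString(), Pattern.CASE_INSENSITIVE);
    }

    public static void main(String[] args) {
        List<String> values = List.of("PM-EM 926-2:2020", "PM-EM ISO 926", "Other value");
        System.out.println(unfiltered(values));
        System.out.println(filtered(values, "*m-Em I*"));
    }
}
```

In this model "PM-EM 926-2:2020" survives the index-only step because it contains the token "em", and it is only dropped once the full pattern is re-checked. That is the kind of gap to look for when comparing the filtered and unfiltered results of the real search.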