marklogic, marklogic-9

Data Movement Manager using combined cts queries


I want to use a combined search query, like the one described in the documentation, with a QueryBatcher, but I don't get the results I expect. This is my query:

<search xmlns="http://marklogic.com/appservices/search" xmlns:cts="http://marklogic.com/cts">
    <cts:element-word-query>
        <cts:element>id</cts:element>
        <cts:text>2</cts:text>
    </cts:element-word-query>
</search>

With a plain QueryManager, this query returns a total count of, let's say, 50.

final QueryManager qMngr = client.newQueryManager();

final RawStructuredQueryDefinition query = qMngr.newRawStructuredQueryDefinition(new StringHandle().with("" +
       "<search xmlns=\"http://marklogic.com/appservices/search\" xmlns:cts=\"http://marklogic.com/cts\">" +
       "   <cts:element-word-query xmlns:cts=\"http://marklogic.com/cts\"><cts:element>id</cts:element><cts:text>2</cts:text></cts:element-word-query>" +
        "</search>").withFormat(Format.XML)
        );
// prints 50
System.out.println("Count by search: "+ qMngr.search(query, new SearchHandle()).getTotalResults());

With a QueryBatcher and the same query, I get every document in my database back. The QueryBatcher doesn't seem to apply my query filter at all:

DataMovementManager dmm = client.newDataMovementManager();
QueryBatchListener listener = (a) -> System.out.println(a.getItems().length);
QueryBatcher queryBatcher = dmm
    .newQueryBatcher(query)
    .onUrisReady(listener);

dmm.startJob(queryBatcher);
queryBatcher.awaitCompletion();
// prints a few lines with 1000 and a few with some smaller number. 
// But WAY more than expected (50!) using the same query as before
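
The listener above prints one size per batch, which makes the total hard to compare with the count from QueryManager. A minimal sketch of summing the batch sizes follows. It uses only the standard library: `Consumer<String[]>` is a stand-in for `QueryBatchListener`, each `String[]` stands in for the array returned by `getItems()`, and the class name `BatchUriCounter` is hypothetical. An `AtomicLong` is used on the assumption that the batcher may invoke the listener from several threads.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

// Stand-in sketch: Consumer<String[]> plays the role of a QueryBatchListener,
// and each String[] plays the role of the URIs in one batch.
class BatchUriCounter {
    private final AtomicLong total = new AtomicLong();

    // Returns a listener that adds each batch's size to the running total.
    // AtomicLong keeps the sum correct if batches arrive on several threads.
    Consumer<String[]> listener() {
        return uris -> total.addAndGet(uris.length);
    }

    long total() {
        return total.get();
    }

    public static void main(String[] args) {
        BatchUriCounter counter = new BatchUriCounter();
        Consumer<String[]> listener = counter.listener();
        // Simulate two full batches and one partial batch.
        listener.accept(new String[1000]);
        listener.accept(new String[1000]);
        listener.accept(new String[37]);
        System.out.println("Total URIs: " + counter.total());
    }
}
```

With the real API, the same idea would be a listener that adds `a.getItems().length` to the counter, with the total read after `awaitCompletion()` returns.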

So I dug into the QueryBatcher code and noticed this call:

UrisHandle results = queryMgr.uris(query, handle, start, null, forest.getForestName())

This calls an internal API to fetch the URIs that are passed to the onUrisReady listener. It does not seem to honor a combined cts query:

final Iterator<String> iterator = ((QueryManagerImpl) qMngr).uris(query, new UrisHandle(), 0, null, "my-forest").iterator();

int count = 0;
while (iterator.hasNext()) {
   iterator.next();
   count++;
}
// prints 1000
System.out.println("By uris: " + count);

Edit: Wrapping a structured query (instead of a cts query) in the combined query does work, but unfortunately that is not an option for me:

final QueryManager qMngr = client.newQueryManager();
final StructuredQueryBuilder sqb = qMngr.newStructuredQueryBuilder();
final RawStructuredQueryDefinition query = qMngr.newRawStructuredQueryDefinition(new StringHandle().with("" +
        "<search xmlns=\"http://marklogic.com/appservices/search\">" +
            sqb.word(sqb.element("id"), "2").serialize() +
        "</search>").withFormat(Format.XML)
    );

DataMovementManager dmm = client.newDataMovementManager();
QueryBatchListener listener = (a) -> System.out.println(a.getItems().length);
QueryBatcher queryBatcher = dmm
    .newQueryBatcher(query)
    .onUrisReady(listener);

dmm.startJob(queryBatcher);
queryBatcher.awaitCompletion();
// returns 50 (in total, in multiple listener calls)
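
For comparison with the cts payload, here is a rough sketch of what the working payload might look like when written out by hand. It assumes the structured query form of a word query (`<query>` containing `<word-query>` with an `<element ns="" name="..."/>` child and a `<text>` child); the exact output of `serialize()` may differ, for example in namespace declarations. The class `StructuredWordQuery` and its methods are hypothetical helpers, not part of the Java client.

```java
// Sketch: builds a combined query payload that wraps a structured word query.
// The element layout is an assumption based on the structured query syntax.
class StructuredWordQuery {
    static final String SEARCH_NS = "http://marklogic.com/appservices/search";

    // Escapes the characters that are unsafe in XML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Wraps a word query on an element (in no namespace) in <search><query>.
    static String combined(String elementName, String text) {
        return "<search xmlns=\"" + SEARCH_NS + "\">"
             + "<query>"
             + "<word-query>"
             + "<element ns=\"\" name=\"" + escape(elementName) + "\"/>"
             + "<text>" + escape(text) + "</text>"
             + "</word-query>"
             + "</query>"
             + "</search>";
    }

    public static void main(String[] args) {
        System.out.println(combined("id", "2"));
    }
}
```

The resulting string could be passed to `new StringHandle().with(...)` in the same way as the payloads above.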

Is this a known bug, or am I doing something wrong here?

  • Java client: 4.1.0
  • MarkLogic: 9.0-6

Solution

  • The fix for this bug will appear in a future release once testing confirms the implementation.

    Here's the issue in the GitHub repository:

    https://github.com/marklogic/java-client-api/issues/965