
Bulk delete documents based on a pattern in MarkLogic (JavaScript)


I'm trying to bulk delete documents based on a pattern, but since the collection contains more than 500K documents, the for loop seems to hang. Below is my code:

for (const uri of cts.uris("", null, cts.jsonPropertyValueQuery("source", "survey"))) {
  xdmp.documentDelete(uri);
}

Can somebody help me with a better way to delete documents in MarkLogic when there is a large volume?


Solution

  • Attempting a "boil the ocean" type query that modifies a really large set of documents in a single transaction is likely to run into limits such as execution time, transaction size, or the expanded tree cache.

    It is best to break out the work into smaller units.

    One easy way is to spawn the work into multiple transactions that get executed on the task server. You can do this rather easily in XQuery with xdmp:spawn-function() (unfortunately, the equivalent function is not available in Server-Side JavaScript, SJS):

    xquery version "1.0-ml";
    for $URI in cts:uris("", (), cts:json-property-value-query("source", "survey"))
    return xdmp:spawn-function(function(){ xdmp:document-delete($URI) })
    

    You could modify the code above to delete subsets of URIs, instead of one at a time. However, those delete transactions should execute very quickly (and in parallel for as many threads as you have configured for the task server), so it may not be worth the hassle of more complicated code.

    Another option is to use a batch tool, such as CoRB.
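
The idea of deleting subsets of URIs instead of one at a time can be sketched in JavaScript roughly as follows. This is only a minimal sketch, not a tested MarkLogic module: the helper names batches and deleteInBatches, the runBatch callback, and the batch size of 1000 are all made up for illustration. The commented-out usage at the bottom assumes that xdmp.invokeFunction with the update option runs each batch in its own transaction, so check that against the documentation for your MarkLogic version.

```javascript
// Group any iterable of URIs (an array, or the Sequence returned by cts.uris)
// into arrays of at most batchSize items.
function* batches(uris, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  let batch = [];
  for (const uri of uris) {
    batch.push(uri);
    if (batch.length === batchSize) {
      yield batch;
      batch = [];
    }
  }
  // Emit the final, possibly smaller, batch.
  if (batch.length > 0) {
    yield batch;
  }
}

// Hand each batch to runBatch (a hypothetical callback supplied by the caller)
// and return the total number of URIs processed.
function deleteInBatches(uris, batchSize, runBatch) {
  let count = 0;
  for (const batch of batches(uris, batchSize)) {
    runBatch(batch);
    count += batch.length;
  }
  return count;
}

// Assumed usage inside MarkLogic (not executed here; the calling module must
// not call declareUpdate(), so that each batch commits on its own):
//
// const uris = cts.uris("", null, cts.jsonPropertyValueQuery("source", "survey"));
// deleteInBatches(uris, 1000, batch =>
//   xdmp.invokeFunction(
//     () => { batch.forEach(uri => xdmp.documentDelete(uri)); },
//     { update: "true" }
//   )
// );
```

Note that even with batching, the outer request still iterates over every URI, so it can hit the request time limit on very large sets. In that case, spawning the work to the task server or using a batch tool is probably the safer route.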