
MarkLogic 9 - Merging when running corb


Merging kicks in while a CoRB job is adding nodes to more than 10 million documents. CPU and I/O max out, which causes timeouts. If I set a merge blackout, the CoRB job stops because too many stands accumulate. What can I do to resolve this?

  • I have 6 forests.
  • merge max size 49152
  • merge min size 1024
  • merge min ratio 3

Solution

  • Merging is a normal and expected process. You don't want to block it completely for an extended period of time, especially when loading lots of data, or you will hit the stand limits.

    It sounds as if your cluster isn't able to handle that load and size of the data. You might need to evaluate whether it is sized and configured appropriately. You haven't mentioned the specs of the server(s), how many servers in the cluster, etc.

    A couple of options to try and make this data loading less impactful:

    • Set a background-io limit

      The MarkLogic documentation describes it this way: "This function sets a limit on the amount of I/O that background tasks (for example, merges) will consume. If the limit is reached, then merges are throttled back to limit their maximum I/O. This can help in situations when the I/O system on the computer is maxed out. In normal operations, you should not need to set this parameter."

    • Reduce the CoRB thread count (the THREAD-COUNT option) so the documents are updated a little more slowly and the system can keep up with the rate at which data is being written
    • Occasionally pause the CoRB job to allow for merges to settle before resuming again. You can configure a COMMAND-FILE and pause/resume by changing the COMMAND option, or enable the UI by specifying the JOB-SERVER-PORT with an open port or range of ports to try to use. From the UI you can pause/resume and change the thread counts as the job is running.
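
The pause/resume approach can also be scripted instead of done by hand. Below is a minimal Python sketch, assuming the CoRB job was started with COMMAND-FILE pointing at the same file the script writes, and that the file uses the usual `KEY=value` properties format. The file name, the helper names (`write_corb_command`, `pause_resume_cycle`), and the timings are hypothetical and only for illustration. A fixed schedule is a simplification; pausing based on the actual stand counts would be better, but that is not shown here.

```python
import os
import tempfile
import time
from pathlib import Path

VALID_COMMANDS = ("PAUSE", "RESUME", "STOP")


def write_corb_command(command_file, command=None, thread_count=None):
    """Write COMMAND and/or THREAD-COUNT to the CoRB command file.

    Returns the list of lines written (without newlines).
    """
    lines = []
    if command is not None:
        if command not in VALID_COMMANDS:
            raise ValueError(f"unsupported command: {command!r}")
        lines.append(f"COMMAND={command}")
    if thread_count is not None:
        if thread_count < 1:
            raise ValueError("thread_count must be at least 1")
        lines.append(f"THREAD-COUNT={thread_count}")

    path = Path(command_file)
    # Write to a temporary file in the same directory, then rename it over
    # the target, so a reader never sees a half-written file.
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    os.replace(tmp_name, path)
    return lines


def pause_resume_cycle(command_file, run_seconds, pause_seconds, cycles,
                       sleep=time.sleep):
    """Let the job run, pause it so merges can settle, then resume.

    The schedule is fixed (an assumption of this sketch); it does not
    inspect the database to see whether merges have actually finished.
    """
    for _ in range(cycles):
        sleep(run_seconds)
        write_corb_command(command_file, command="PAUSE")
        sleep(pause_seconds)
        write_corb_command(command_file, command="RESUME")
```

For example, `pause_resume_cycle("corb-command.properties", run_seconds=600, pause_seconds=300, cycles=10)` would let the job run for ten minutes and then pause it for five, ten times over. Calling `write_corb_command("corb-command.properties", thread_count=4)` on its own lowers the thread count without pausing.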