linux, marklogic, marklogic-9

MarkLogic Filesystem Log entry


I am seeing "Slow background" messages like the ones below in my MarkLogic cluster logs:

2020-01-14 05:55:22.649 Info: Slow background cluster.background.clusters, 5.727 sec

2020-01-14 05:55:22.649 Info: Slow background cluster.background.hosts.AssignmentManager, 5.581 sec

I suspect the filesystem MarkLogic runs on is slow and is not able to keep up with MarkLogic. I am also seeing the log entry below:

2020-01-14 05:55:53.380 Info: Linux file system mount option 'barrier' is default; recommend faster 'nobarrier' if storage has non-volatile write cache

What do the above log entries mean in MarkLogic? How can I tell whether the filesystem is actually running slowly?


Solution

  • A "Slow background" message means that a background activity took longer than expected. It is an indicator of starvation.

    From your question alone it is impossible to say what is causing it. Typically, it is related to the underlying physical infrastructure on which MarkLogic is running. MarkLogic doesn't have its own filesystem or other resources: it uses the operating system's filesystem, memory, and so on. If the available physical resources are not enough for MarkLogic to serve the requested load, background operations will take longer than expected. This will always be reflected in the log.
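Because MarkLogic relies on the operating system's filesystem, one piece of evidence worth collecting is how the volume holding the MarkLogic data is mounted. The following is a minimal Python sketch, not an official MarkLogic tool: it parses text in the `/proc/mounts` format and reports whether write barriers appear to be disabled for a given path. The sample mount table, the device names, and the helper names (`parse_mounts`, `mount_for_path`, `barriers_disabled`) are assumptions for illustration, and the exact spelling of the barrier option depends on the filesystem and kernel version.

```python
import os

# Hypothetical mount table in /proc/mounts format (illustration only).
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sdb1 /var/opt/MarkLogic ext4 rw,noatime,nobarrier 0 0
"""


def parse_mounts(mounts_text):
    """Parse /proc/mounts-style text into (mount_point, fs_type, options) tuples."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fs_type, options = fields[:4]
        entries.append((mount_point, fs_type, options.split(",")))
    return entries


def mount_for_path(path, entries):
    """Return the entry whose mount point is the longest prefix of path.

    Assumption: mount points contain no escaped characters such as \\040.
    """
    path = os.path.normpath(path)
    best = None
    for entry in entries:
        mount_point = entry[0]
        if mount_point == "/":
            is_prefix = True
        else:
            is_prefix = path == mount_point or path.startswith(mount_point + "/")
        # ">=" lets a later entry for the same mount point win over an earlier one.
        if is_prefix and (best is None or len(mount_point) >= len(best[0])):
            best = entry
    return best


def barriers_disabled(options):
    """True if the option list explicitly turns write barriers off."""
    return "nobarrier" in options or "barrier=0" in options


if __name__ == "__main__":
    entries = parse_mounts(SAMPLE_MOUNTS)
    mount_point, fs_type, options = mount_for_path("/var/opt/MarkLogic/Forests", entries)
    print(mount_point, fs_type, "barriers disabled:", barriers_disabled(options))
```

On a real Linux host, the contents of `/proc/mounts` could be passed to `parse_mounts` in place of the sample text. Note that the log message only recommends `nobarrier` when the storage has a non-volatile write cache, so the result is a data point to review with whoever manages the storage, not a setting to change blindly.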

    You can read more here:

    Understanding "slow background" messages

    https://help.marklogic.com/Knowledgebase/Article/View/508/0/understanding-slow-infrastructure-notifications

    29 August 2019 10:54 AM

    Introduction

    In more recent versions of MarkLogic Server, "slow background" error log messages were added to note and help diagnose slowness.

    Details

    For "Slow background" messages, the system is timing how long it took to do some named background activity. These activities should not take long and the "slow background" message is an indicator of starvation. The activity can be slow because:

      • it is waiting on a mutex or semaphore held by some other slow thread;
      • the operating system is stalling it, possibly because it is thrashing because of low memory.

    Looking at the "slow background" messages in isolation is not sufficient to understand the reason - we just know a lot of time passed since the last time we read the time of day clock. To understand the actual cause, additional evidence will need to be gathered from the time of the incident.
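As a starting point for gathering that evidence, the timestamps of the slow messages can be pulled out of the log and lined up against OS-level metrics (memory, swap, disk I/O) from the same period. The sketch below is one possible approach in Python, not part of MarkLogic: the regular expression is inferred only from the two sample lines in the question, and the function name `summarize_slow_background` is hypothetical.

```python
import re
from collections import defaultdict

# Pattern inferred from the sample lines in the question; other MarkLogic
# versions may format these entries differently (assumption).
SLOW_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\w+: Slow background (?P<activity>[^,]+), "
    r"(?P<seconds>\d+(?:\.\d+)?) sec"
)

SAMPLE_LOG = [
    "2020-01-14 05:55:22.649 Info: Slow background cluster.background.clusters, 5.727 sec",
    "2020-01-14 05:55:22.649 Info: Slow background cluster.background.hosts.AssignmentManager, 5.581 sec",
    "2020-01-14 05:55:53.380 Info: Linux file system mount option 'barrier' is default; "
    "recommend faster 'nobarrier' if storage has non-volatile write cache",
]


def summarize_slow_background(lines):
    """Group 'Slow background' entries by activity name.

    Returns {activity: {"count", "max_sec", "first", "last"}} where first/last
    are the timestamp strings of the earliest and latest matching lines seen.
    """
    stats = defaultdict(
        lambda: {"count": 0, "max_sec": 0.0, "first": None, "last": None}
    )
    for line in lines:
        match = SLOW_RE.match(line.strip())
        if match is None:
            continue  # not a slow-background entry
        entry = stats[match.group("activity")]
        entry["count"] += 1
        entry["max_sec"] = max(entry["max_sec"], float(match.group("seconds")))
        if entry["first"] is None:
            entry["first"] = match.group("ts")
        entry["last"] = match.group("ts")
    return dict(stats)


if __name__ == "__main__":
    for activity, info in summarize_slow_background(SAMPLE_LOG).items():
        print(activity, info)
```

To run it against a real log, the lines of the ErrorLog file can be passed in place of `SAMPLE_LOG`, for example via `open(path)` where `path` points at the log file on your host. The resulting time windows only say when the stalls happened; the cause still has to come from system metrics for those windows.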

    Notes:

      • In general, we do not time how long it takes to acquire a mutex or semaphore as reading the clock is usually more expensive than getting a mutex or semaphore.
      • We do not time things that usually take about a microsecond.
      • We do time things that usually take about a millisecond.

    Related Articles

    Knowledgebase: Understanding Slow Infrastructure Notifications

    Knowledgebase: [Understanding slow 'journal frame' entries in the ErrorLog](https://help.marklogic.com/Knowledgebase/Article/View/460/0/understanding-slow-journal-frame-entries-in-the-errorlog)

    Knowledgebase: [Hung Messages in the ErrorLog](https://help.marklogic.com/Knowledgebase/Article/View/35/0/hung-messages-in-the-errorlog)