
Logback: purpose and scope of StatusManager for Logstash-using microservice


According to the manual:

Logback collects its internal status data in a StatusManager object, accessible via the LoggerContext.

Given a StatusManager you can access all the status data associated with a logback context. To keep memory usage at reasonable levels, the default StatusManager implementation stores the status messages in two separate parts: the header part and the tail part. The header part stores the first H status messages whereas the tail part stores the last T messages. At present time H=T=150, although these values may change in future releases.

The purpose seems to be to allow these internal status messages to be listened to, and also viewed via a servlet.
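
For example (an illustrative sketch, as I understand the API), the stored status data can be read back programmatically along these lines:

import ch.qos.logback.classic.LoggerContext;
import ch.qos.logback.core.status.Status;
import ch.qos.logback.core.status.StatusManager;
import org.slf4j.LoggerFactory;

public class StatusDump {
    public static void main(String[] args) {
        // When Logback is the SLF4J binding, the ILoggerFactory is the LoggerContext
        LoggerContext context = (LoggerContext) LoggerFactory.getILoggerFactory();
        StatusManager statusManager = context.getStatusManager();

        // Iterate over the retained status messages (the header/tail parts described above)
        for (Status status : statusManager.getCopyOfStatusList()) {
            System.out.println(status);
        }
    }
}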

In our case we use the logstash-logback-encoder and emit our Logback messages to the console so that fluentd / Filebeat / etc. can ship them to a remote Logstash, so our microservices have no need to inspect or listen to these messages.

So the question is: can / should we override the default BasicStatusManager with a no-op implementation to suppress the internal storage of these messages, or does anyone know whether the Logstash encoder depends upon, or hooks into, this status mechanism itself?
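
For concreteness, what I have in mind is roughly the following (an untested sketch; NoOpStatusManager is just an illustrative name):

import ch.qos.logback.classic.LoggerContext;
import ch.qos.logback.core.BasicStatusManager;
import ch.qos.logback.core.status.Status;
import org.slf4j.LoggerFactory;

// Extends the default manager but discards every status message instead of
// retaining it in the header/tail buffers (listeners would not be notified either).
class NoOpStatusManager extends BasicStatusManager {
    @Override
    public void add(Status status) {
        // intentionally drop the status message
    }
}

public class LoggingBootstrap {
    public static void main(String[] args) {
        // Install the no-op manager on the LoggerContext at startup
        LoggerContext context = (LoggerContext) LoggerFactory.getILoggerFactory();
        context.setStatusManager(new NoOpStatusManager());
    }
}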

Our logback.xmls are pretty simple:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <appender name="lsAppender" class="ch.qos.logback.core.ConsoleAppender">
        <encoder class="net.logstash.logback.encoder.LogstashEncoder"/>
    </appender>

    <logger name="a.b.c" level="debug"/>

    <root level="debug">
        <appender-ref ref="lsAppender"/>
    </root>
</configuration>

My second reason for asking is that we recently found ourselves getting OOMEs from microservices with a 120 MB heap. Sure, those individual log messages are too big, but that is only a problem if Logback itself holds onto them rather than sending them off and forgetting about them. Suppressing the internal storage would be one less scalability concern.

[Screenshot of YourKit output]


Solution

  • I noticed this again recently. The problem actually had nothing to do with log size or logging throughput, and there is no need to control the size of the StatusManager in any way.

    The actual cause was that the Logstash Logback Encoder (version 6.4) was failing "silently" due to unencodable Markers:

    import net.logstash.logback.marker.Markers;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.springframework.web.bind.annotation.ControllerAdvice;
    import org.springframework.web.context.request.WebRequest;
    import java.util.HashMap;
    
    @ControllerAdvice
    public class MyExceptionHandler {
    
        private static final Logger LOG = LoggerFactory.getLogger(MyExceptionHandler.class);
    
        public void logException(Throwable e, WebRequest request) {
            // The whole WebRequest object is attached to the log event as a Marker;
            // the LogstashEncoder will later try to serialize it to JSON with jackson-databind
            var fields = new HashMap<String, Object>();
            fields.put("Request", request);

            LOG.error(Markers.appendEntries(fields), "Error occurred", e);
        }
    }
    

    The WebRequest is actually an instance of org.springframework.web.context.request.ServletWebRequest. When the encoder serializes it to JSON via jackson-databind, the serialization throws a StackOverflowError, which the encoder catches and ignores, and the original log event is never flushed from Logback's internal cache.

    Given enough errors of this type, we eventually ran out of memory.
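
    One straightforward way to avoid this (a sketch, not necessarily the exact change we made; the "Request" value shown is illustrative) is to put only JSON-friendly values into the Marker, for example a string description of the request rather than the WebRequest object itself:

    import net.logstash.logback.marker.Markers;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.springframework.web.bind.annotation.ControllerAdvice;
    import org.springframework.web.context.request.WebRequest;
    import java.util.HashMap;
    import java.util.Map;

    @ControllerAdvice
    public class MyExceptionHandler {

        private static final Logger LOG = LoggerFactory.getLogger(MyExceptionHandler.class);

        public void logException(Throwable e, WebRequest request) {
            Map<String, Object> fields = new HashMap<>();
            // Attach a plain-string summary of the request, so the encoder never
            // has to walk the servlet object graph with jackson-databind
            fields.put("Request", request.getDescription(true));

            LOG.error(Markers.appendEntries(fields), "Error occurred", e);
        }
    }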