Search code examples
javaout-of-memoryactivemq-classicstomp

activemq oom after enabling stomp


After enabling STOMP protocol (before it was only default protocol enabled) on the Activemq server it started to fail with oom. I have only 1 client using STOMP. It can work for 1 week without fails or fail a day after a restart. Here is the config file:

<beans
  xmlns="http://www.springframework.org/schema/beans"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
  http://activemq.apache.org/schema/core http://activemq.apache.org/schema/core/activemq-core.xsd">

    <bean id="logQuery" class="io.fabric8.insight.log.log4j.Log4jLogQuery"
          lazy-init="false" scope="singleton"
          init-method="start" destroy-method="stop">
    </bean>

    <!--
        The <broker> element is used to configure the ActiveMQ broker.
    -->
    <broker useJmx="true" xmlns="http://activemq.apache.org/schema/core" brokerName="cms-mq" dataDirectory="${activemq.data}">

        <destinationInterceptors>
            <virtualDestinationInterceptor>
                <virtualDestinations>
                    <virtualTopic name="VirtualTopic.>" selectorAware="true"/>
                </virtualDestinations>
            </virtualDestinationInterceptor>
        </destinationInterceptors>

        <destinationPolicy>
            <policyMap>
              <policyEntries>
                <policyEntry topic=">" producerFlowControl="false">
                </policyEntry>
                <policyEntry queue=">" producerFlowControl="false">
                </policyEntry>
              </policyEntries>
            </policyMap>
        </destinationPolicy>

        <managementContext>
            <managementContext createConnector="false"/>
        </managementContext>
        <persistenceAdapter>
            <kahaDB directory="${activemq.data}/kahadb"/>
        </persistenceAdapter>
          <systemUsage>
            <systemUsage>
                <memoryUsage>
                    <memoryUsage percentOfJvmHeap="70" />
                </memoryUsage>
                <storeUsage>
                    <storeUsage limit="4 gb"/>
                </storeUsage>
                <tempUsage>
                    <tempUsage limit="4 gb"/>
                </tempUsage>
            </systemUsage>
        </systemUsage>

        <transportConnectors>
            <transportConnector name="auto" uri="auto+nio://0.0.0.0:61616?maximumConnections=1000&amp;auto.protocols=default,stomp"/>
        </transportConnectors>
        <shutdownHooks>
            <bean xmlns="http://www.springframework.org/schema/beans" class="org.apache.activemq.hooks.SpringContextHook" />
        </shutdownHooks>

        <plugins>
          ... security plugins config...
        </plugins>

    </broker>

    <import resource="jetty.xml"/>

</beans>

start args:

/usr/java/default/bin/java -Xms256M -Xmx1G -Dorg.apache.activemq.UseDedicatedTaskRunner=false -XX:HeapDumpPath=/var/logs/heapDumps -XX:+HeapDumpOnOutOfMemoryError -Dcom.sun.management.jmxremote.port=8162 -Dcom.sun.management.jmxremote.rmi.port=8162 -Dcom.sun.management.jmxremote.password.file=/opt/apache-activemq-5.13.0//conf/jmx.password -Dcom.sun.management.jmxremote.access.file=/opt/apache-activemq-5.13.0//conf/jmx.access -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote -Djava.awt.headless=true -Djava.io.tmpdir=/opt/apache-activemq-5.13.0//tmp -Dactivemq.classpath=/opt/apache-activemq-5.13.0//conf:/opt/apache-activemq-5.13.0//../lib/ -Dactivemq.home=/opt/activemq -Dactivemq.base=/opt/activemq -Dactivemq.conf=/opt/apache-activemq-5.13.0//conf -Dactivemq.data=/opt/apache-activemq-5.13.0//data -jar /opt/activemq/bin/activemq.jar start

UPD: From Eclipse MemoryAnalizer:

Leak Suspects
247,036 instances of "org.apache.activemq.command.ActiveMQBytesMessage", loaded by "java.net.URLClassLoader @ 0xc02e9470" occupy 811,943,360 (76.92%) bytes. 

81 instances of "org.apache.activemq.broker.region.cursors.FilePendingMessageCursor", loaded by "java.net.URLClassLoader @ 0xc02e9470" occupy 146,604,368 (13.89%) bytes.

UPD: Before having OOM error there are several error in the log like the following:

| ERROR | Could not accept connection from null: java.lang.IllegalStateException: Timer already cancelled. | org.apache.activemq.broker.TransportConnector | ActiveMQ BrokerService[cms-mq] Task-13707
| INFO  | The connection to 'null' is taking a long time to shutdown. | org.apache.activemq.broker.TransportConnection | ActiveMQ BrokerService[cms-mq] Task-13738

Would appriciate any help in debugging it. Can provide additional info if needed.


Solution

  • I examined the client code (ruby stomp client) and it turned out that there was 'activemq.subscriptionName' sub header without 'client-id' connect header. When you set 'activemq.subscriptionName' subscription header that means that you want your subscriber to be durable. But you should also set 'client-id' connect header because otherwise it is autogenerated. When 'client-id' header is not set we have a situation in which the broker can't identify the stomp client by it's client id when it reconnects. As a result there were a lot of Offline Durable Topic Subscribers and messages were piling up for every client-id => OOM error.