Search code examples
rabbitmqamqpspring-amqp

Spring-amqp listen treads exits for unknown reason. ( reason: Attempt to use closed channel)


We are running a number of services using rabbitmq as messaging middleware. While services are operating we see services stop listening to requests, requiring us to restart the service to get it working again. We run multiple services on the same JVM (Oracle Weblogic application server), but not all services stop, usually only one, but not always the same one.

It does not seem to be load related, as the services have a very different load profile.

We set up the heartbeat protocol, but it did not solve the problem. Since the issue also occurs in our test environment, we set log level to debug, hoping this would reveal the cause and hint to a solution.

Regrettably, it did not or we lack understanding of what is happening (not too familiar with spring-amqp and amqp library, but we read the spring documentation).

We see the following exception in the logs, and after that nothing happens until we restart the service:

;2015-05-08 00:35:54,578; DEBUG; servicename=; clusterid=; username=; msguuid=; org.springframework.amqp.rabbit.core.RabbitAdmin - Declarations finished
;2015-05-08 00:37:12,015; ERROR; servicename=; clusterid=; username=; msguuid=; nl.pharmapartners.amqp.connectors.AmqpMessageListenerContainer - Consumer received fatal exception on startup
;org.springframework.amqp.rabbit.listener.FatalListenerStartupException: Authentication failure
    at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.start(BlockingQueueConsumer.java:367)
    at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.run(SimpleMessageListenerContainer.java:963)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.springframework.amqp.AmqpAuthenticationException: com.rabbitmq.client.PossibleAuthenticationFailureException: Possibly caused by authentication failure
    at org.springframework.amqp.rabbit.support.RabbitExceptionTranslator.convertRabbitAccessException(RabbitExceptionTranslator.java:57)
    at org.springframework.amqp.rabbit.connection.AbstractConnectionFactory.createBareConnection(AbstractConnectionFactory.java:195)
    at org.springframework.amqp.rabbit.connection.CachingConnectionFactory.createConnection(CachingConnectionFactory.java:359)
    at org.springframework.amqp.rabbit.connection.CachingConnectionFactory.createBareChannel(CachingConnectionFactory.java:309)
    at org.springframework.amqp.rabbit.connection.CachingConnectionFactory.getCachedChannelProxy(CachingConnectionFactory.java:283)
    at org.springframework.amqp.rabbit.connection.CachingConnectionFactory.getChannel(CachingConnectionFactory.java:276)
    at org.springframework.amqp.rabbit.connection.CachingConnectionFactory.access$600(CachingConnectionFactory.java:69)
    at org.springframework.amqp.rabbit.connection.CachingConnectionFactory$ChannelCachingConnectionProxy.createChannel(CachingConnectionFactory.java:614)
    at org.springframework.amqp.rabbit.connection.ConnectionFactoryUtils$1.createChannel(ConnectionFactoryUtils.java:85)
    at org.springframework.amqp.rabbit.connection.ConnectionFactoryUtils.doGetTransactionalResourceHolder(ConnectionFactoryUtils.java:134)
    at org.springframework.amqp.rabbit.connection.ConnectionFactoryUtils.getTransactionalResourceHolder(ConnectionFactoryUtils.java:67)
    at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.start(BlockingQueueConsumer.java:363)
    ... 2 more
Caused by: com.rabbitmq.client.PossibleAuthenticationFailureException: Possibly caused by authentication failure
    at com.rabbitmq.client.impl.AMQConnection.start(AMQConnection.java:373)
    at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:516)
    at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:545)
    at org.springframework.amqp.rabbit.connection.AbstractConnectionFactory.createBareConnection(AbstractConnectionFactory.java:191)
    ... 12 more
Caused by: com.rabbitmq.client.AlreadyClosedException: clean connection shutdown; reason: Attempt to use closed channel
    at com.rabbitmq.client.impl.AMQChannel.ensureIsOpen(AMQChannel.java:190)
    at com.rabbitmq.client.impl.AMQChannel.rpc(AMQChannel.java:223)
    at com.rabbitmq.client.impl.AMQChannel.privateRpc(AMQChannel.java:209)
    at com.rabbitmq.client.impl.AMQChannel.rpc(AMQChannel.java:202)
    at com.rabbitmq.client.impl.AMQConnection.start(AMQConnection.java:355)
    ... 15 more
2015-05-08 00:37:12,032; ERROR; servicename=; clusterid=; username=; msguuid=; nl.pharmapartners.amqp.connectors.AmqpMessageListenerContainer - Stopping container from aborted consumer

We have no real clue what happened, before this we see the exhanges, queues and bindings being redeclared about every minute. This log occurs about 1,5 minutes after the redeclarations finished. There is no logging which indicates a problem with rabbitmq, so we have no clue about the reported attempt to use an already closed connection. This is all from the amqp-client or spring-amqp libraries.

I have not been able to find related error logging on the rabbit cluster, but I only looked in /var/log/rabbitmq directory.

The log seems to indicate an authentication issue, but other services running on the same node using the same authentication parameters have no problem (and the service itself has been running quite a while (up to days) without any problem as well. The original exception indicates the library obtains a new channel, but it is already closed before it can be used.

Can somebody explain what is happening and possibly how to fix this ?

We are running RabbitMQ 3.3.3/Erlang 17 with Spring amqp version 1.3.2, amqp-client 3.2.4.

Additional information: We are configuring various components with defaults, with the following exceptions:

#The waiting time (ms) for a response in amqp (for rpc calls)
amqp.timeout=30000
prefetch.count=1
max.concurrent.consumers=5
start.consumer.min.interval=500
stop.consumer.min.interval=5000
consecutive.active.trigger=3
consecutive.idle.trigger=3
amqp.message.ttl=900000
amqp.heartbeat=5

RabbitMQ has a policy that will remove queues that have no consumers for more than 120 seconds. All queues are mirrored to all nodes in the rabbit cluster (3 nodes).

We have figured out we use a CachingConnectionFactory, with its default cache size of 1 (we are going to change this as this does not seem appropiate for 5 concurrent consumers, we think this should be at least equal to the max consumers).

Thanks in advance,

Wim Veldhuis.


Solution

  • As I said in the rabbitmq-users group, this "channel" is an internal special channel used to authenticate a new connection.

    Caused by: com.rabbitmq.client.PossibleAuthenticationFailureException: Possibly caused by authentication failure
        at com.rabbitmq.client.impl.AMQConnection.start(AMQConnection.java:373)
        at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:516)
        at com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:545)
        at org.springframework.amqp.rabbit.connection.AbstractConnectionFactory.createBareConnection(AbstractConnectionFactory.java:191)
        ... 12 more
    Caused by: com.rabbitmq.client.AlreadyClosedException: clean connection shutdown; reason: Attempt to use closed channel
        at com.rabbitmq.client.impl.AMQChannel.ensureIsOpen(AMQChannel.java:190)
        at com.rabbitmq.client.impl.AMQChannel.rpc(AMQChannel.java:223)
        at com.rabbitmq.client.impl.AMQChannel.privateRpc(AMQChannel.java:209)
        at com.rabbitmq.client.impl.AMQChannel.rpc(AMQChannel.java:202)
        at com.rabbitmq.client.impl.AMQConnection.start(AMQConnection.java:355)
        ... 15 more
    

    As you can see, we are trying to create a new connection

    com.rabbitmq.client.ConnectionFactory.newConnection(ConnectionFactory.java:545)
        at     org.springframework.amqp.rabbit.connection.AbstractConnectionFactory.createBareConnection(AbstractConnectionFactory.java:191)
    

    Presumably because the existing connection closed for some reason.

    And this exception occurred during the authentication; Spring AMQP never regained control.

    Again, as I said on the group, you need to look at the rabbitmq-server log to see if there any clues there.

    Caching is irrelevant for consumer channels - they are not eligible for caching anyway (since 1.1.3).

    EDIT:

    We changed authentication errors to be fatal after user requests, in 1.2.1 as a result of user requests (no point in keep trying if the credentials are bad).

    However, it appears there is a race condition in the rabbitmq-client in that if a connection is closed right after it opens (before authentication), it is reported as a PossibleAuthenticationFailureException.

    We could add an option to treat PossibleAuthenticationFailureException as non-fatal (and keep trying). Feel free to open up a JIRA issue and we'll take a look.