My application consumes messages from RabbitMQ and processes them. I have about 10 queues, and each queue has up to 10 consumers (threads). I have a prefetch of 5. I'm running my setup on Heroku using the CloudAMQP plugin (RabbitMQ as a service).
I'm running with the default heartbeat and connection timeout settings (60 seconds).
My Java application is a Spring Boot application using the spring-rabbit library.
Versions:
RabbitMQ 3.5.3
Erlang 17.5.3
Java 1.8
Spring Boot 1.3.2.RELEASE
Spring Rabbit 1.5.3.RELEASE
The problem is that the consumers of one particular queue stop consuming messages after some time. When I restart my Java application, everything works fine again. The other queues are consumed normally the whole time. There are no errors on the application's side. In RabbitMQ's log stream I see entries like:
= REPORT==== 2016-08-02 15:53:32 UTC ===
closing AMQP connection <SOMETHING> (SOMETHING_ELSE -> SOMETHING_ELSE_ELSE):
{heartbeat_timeout,running}
I can't reproduce this locally or in a testing environment on Heroku.
Update
The code below can be found in AMQConnection.class:

    int heartbeat = negotiatedMaxValue(this.requestedHeartbeat,
                                       connTune.getHeartbeat());

    private static int negotiatedMaxValue(int clientValue, int serverValue) {
        return (clientValue == 0 || serverValue == 0) ?
            Math.max(clientValue, serverValue) :
            Math.min(clientValue, serverValue);
    }
Because the smaller of the two values wins when both are non-zero, I can't increase the heartbeat above 60 seconds (which is what I'm getting from the server).
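To make the effect concrete, here is a minimal standalone sketch of that negotiation rule. The class name HeartbeatNegotiation and the sample client values are made up for illustration; only the body of negotiatedMaxValue is copied from the snippet above, and 60 is the value I get from the server.

```java
public class HeartbeatNegotiation {

    // Same rule as the snippet quoted from AMQConnection:
    // if either side sends 0, the larger value is used;
    // otherwise the smaller of the two values is used.
    public static int negotiatedMaxValue(int clientValue, int serverValue) {
        return (clientValue == 0 || serverValue == 0)
                ? Math.max(clientValue, serverValue)
                : Math.min(clientValue, serverValue);
    }

    public static void main(String[] args) {
        int serverHeartbeat = 60; // what the broker proposes in my case

        // Requesting more than the server's value has no effect.
        System.out.println(negotiatedMaxValue(120, serverHeartbeat)); // 60
        // Requesting less than the server's value is honoured.
        System.out.println(negotiatedMaxValue(30, serverHeartbeat));  // 30
        // Requesting 0 falls back to the server's value.
        System.out.println(negotiatedMaxValue(0, serverHeartbeat));   // 60
    }
}
```

So with this rule, the client can only lower the heartbeat interval below the server's value, never raise it.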
Unfortunately, this seems like a networking issue. This could be due to a few things:
One way to force all of your dynos to restart is to run:

    $ heroku ps:restart

This will force Heroku to restart your dynos, which frequently means moving them to a new EC2 host. This may help if it is a one-off issue.