java microservices cqrs event-sourcing axon

Why does the RetryScheduler in Axon Framework not retry after a NoHandlerForCommandException?


I have a Saga that sends a command to a different microservice when a specific event occurs. I wanted to configure the commandGateway with a RetryScheduler so that it retries sending the command if the other microservice is down. The RetryScheduler only performs retries if the exception is a RuntimeException, which the NoHandlerForCommandException thrown when the other service is offline definitely is.

If I don't set maxRetryCount, the error message is
o.a.c.gateway.IntervalRetryScheduler : Processing of Command [XXXCommand] resulted in an exception 1 times. Giving up permanently

If I do set the attribute the error message is
o.a.c.gateway.IntervalRetryScheduler : Processing of Command [XXXCommand] resulted in an exception and will not be retried

If the other microservice is running, then the command is handled correctly, no problems.

Does anybody have an idea what could be the issue?

This is my configuration for the commandGateway with a RetryScheduler:

@Bean
public CommandGateway commandGateway(){

    Configurer configurer = DefaultConfigurer.defaultConfiguration();

    CommandBus commandBus = configurer.buildConfiguration().commandBus();

    ScheduledExecutorService scheduledExecutorService = Executors.newScheduledThreadPool(1);
    RetryScheduler rs = IntervalRetryScheduler.builder()
            .retryExecutor(scheduledExecutorService)
            .maxRetryCount(100)
            .retryInterval(1000)
            .build();
    CommandGateway commandGateway = DefaultCommandGateway.builder()
            .commandBus(commandBus)
            .retryScheduler(rs)
            .build();

    return commandGateway;
}

Solution

  • To resolve the problem at hand, you could provide your own implementation of the IntervalRetryScheduler, which overrides the IntervalRetryScheduler#isExplicitlyNonTransient(Throwable) method so that a NoHandlerForCommandException is treated as retryable.

    Note though that the IntervalRetryScheduler intentionally does not retry exceptions of type AxonNonTransientException, as those typically signal non-recoverable failures, and NoHandlerForCommandException is one of them. It means that the CommandBus implementation being used has no clue whom to give the command to, which in general suggests something that can't be fixed by retrying.

    It does, however, seem that you have a scenario where retrying makes sense. Thus, as I pointed out at the start, overriding the isExplicitlyNonTransient(Throwable) method to exclude NoHandlerForCommandException would be the way to go for you, I think.

    Hope this helps you out!
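
To make the suggested override concrete, here is a minimal, self-contained sketch of the retry decision. It does not use Axon itself: the two exception classes are stand-ins that mirror only the inheritance relationship, and ModelRetryScheduler, NoHandlerRetryingScheduler and shouldRetry are hypothetical names that model, in simplified form, how the scheduler appears to decide (non-transient check plus a failure count compared against maxRetryCount). Treat it as an illustration of the idea rather than the framework's actual code.

```java
public class RetryDecisionSketch {

    // Stand-in for Axon's AxonNonTransientException (assumption: simplified).
    public static class AxonNonTransientException extends RuntimeException {
        public AxonNonTransientException(String message) {
            super(message);
        }
    }

    // Stand-in for Axon's NoHandlerForCommandException, which is non-transient.
    public static class NoHandlerForCommandException extends AxonNonTransientException {
        public NoHandlerForCommandException(String message) {
            super(message);
        }
    }

    // Hypothetical, simplified model of the scheduler's retry decision.
    public static class ModelRetryScheduler {
        private final int maxRetryCount;

        public ModelRetryScheduler(int maxRetryCount) {
            this.maxRetryCount = maxRetryCount;
        }

        // Non-transient if the failure, or any of its causes, is an AxonNonTransientException.
        protected boolean isExplicitlyNonTransient(Throwable failure) {
            return failure instanceof AxonNonTransientException
                    || (failure.getCause() != null && isExplicitlyNonTransient(failure.getCause()));
        }

        // Retry only transient failures, and only while the retry budget lasts.
        public boolean shouldRetry(RuntimeException lastFailure, int failureCount) {
            return !isExplicitlyNonTransient(lastFailure) && failureCount <= maxRetryCount;
        }
    }

    // The override from the answer: NoHandlerForCommandException counts as retryable.
    public static class NoHandlerRetryingScheduler extends ModelRetryScheduler {
        public NoHandlerRetryingScheduler(int maxRetryCount) {
            super(maxRetryCount);
        }

        @Override
        protected boolean isExplicitlyNonTransient(Throwable failure) {
            if (failure instanceof NoHandlerForCommandException) {
                return false;
            }
            // Every other failure keeps the default classification.
            return super.isExplicitlyNonTransient(failure);
        }
    }

    public static void main(String[] args) {
        RuntimeException offline = new NoHandlerForCommandException("no handler for XXXCommand");
        System.out.println(new ModelRetryScheduler(100).shouldRetry(offline, 1));          // false
        System.out.println(new NoHandlerRetryingScheduler(100).shouldRetry(offline, 1));   // true
        System.out.println(new NoHandlerRetryingScheduler(100).shouldRetry(offline, 101)); // false
    }
}
```

In a real application the same override would go into a class that extends IntervalRetryScheduler, and an instance of that class would be passed to retryScheduler(...) on the DefaultCommandGateway builder in place of the stock scheduler. How the subclass is constructed depends on the Axon version in use, so check which constructor IntervalRetryScheduler exposes to subclasses before relying on this.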