microservices axon saga

Orchestrating saga does not "survive" service failure


Assume you have two different microservices (Customer and Account) both running as a Spring Boot application in a Docker container. Each time a new customer is created, a corresponding account should be created as well. To orchestrate this flow, I have a third "service" implementing orchestration-based saga logic.

The saga "service" contains the following code.

@Saga
public class CustomerAccountSaga {

    private static final String ACCOUNT_CREATION_DEADLINE = "sagas.account-creation-deadline";

    private static final Logger LOGGER = LogManager.getLogger(CustomerAccountSaga.class);


    @Autowired
    private transient CommandGateway commandGateway;

    @Autowired
    private transient DeadlineManager deadlineManager;


    private String customerId;

    private String accountDeadlineId;



    @StartSaga
    @SagaEventHandler(associationProperty = "aggregateId")
    private void on(CustomerCreatedEvent event) {
        if(LOGGER.isDebugEnabled()) {
            LOGGER.debug("A new customer has been created for id = '{}'", event.getAggregateId());
        }
    
        this.customerId = event.getAggregateId().getId();
        SagaLifecycle.associateWith("customerId", customerId);
    
        // Both services have a CustomerId class, each defined in its own package.
        b.t.c.a.v.CustomerId id = new b.t.c.a.v.CustomerId(customerId);
        CreateAccountCommand createAccount = new CreateAccountCommand(id);
        commandGateway.send(createAccount);
    
        this.accountDeadlineId = deadlineManager.schedule(Duration.ofDays(1), ACCOUNT_CREATION_DEADLINE);
    }

    @SagaEventHandler(associationProperty = "aggregateId")
    private void on(CustomerDeletedEvent event) {
        if(LOGGER.isDebugEnabled()) {
            LOGGER.debug("A customer with id '{}' has been deleted. "
                    + "The customer was deleted before the account was created, "
                    + "or the request to create the account timed-out", 
                    event.getAggregateId());
        }
        deadlineManager.cancelSchedule(ACCOUNT_CREATION_DEADLINE, accountDeadlineId);
        SagaLifecycle.end();
    }

    @SagaEventHandler(associationProperty = "customerId")
    private void on(AccountCreatedEvent event) {
        if(LOGGER.isDebugEnabled()) {
            LOGGER.debug("A corresponding account for customer with id '{}' has been created", 
                    event.getCustomerId());
        }
        deadlineManager.cancelSchedule(ACCOUNT_CREATION_DEADLINE, accountDeadlineId);
        SagaLifecycle.end();
    }

    @DeadlineHandler(deadlineName = ACCOUNT_CREATION_DEADLINE)    
    public void on() {
        if(LOGGER.isDebugEnabled()) {
            LOGGER.debug("Failed to create a new account for customer with id '{}' in a timely fashion", 
                    customerId);
        }
    
        // Both services have a CustomerId class, each defined in its own package.
        b.t.c.c.v.CustomerId id = new b.t.c.c.v.CustomerId(customerId);
        DeleteCustomerCommand deleteCustomer = new DeleteCustomerCommand(id);
        commandGateway.send(deleteCustomer);
    }

}

When all services are up and running, everything works as expected. A CustomerCreatedEvent is handled by the saga handler and fires a CreateAccountCommand as expected. The latter results in the creation of the account and firing the AccountCreatedEvent which is also handled by the saga logic.

The problem arises when I try the following scenarios. In all cases, the customer service is running.

Scenario A

  1. Create a new customer using the customer service.
  2. Start the account service. As expected, nothing happens, since the account service does not listen for any events originating from the customer service.
  3. Start the saga service. I would expect the saga service to receive the CustomerCreatedEvent, which it has not processed before, and to orchestrate the creation of the corresponding account.

Scenario B

  1. Create a new customer using the customer service.
  2. Start the saga service. I would expect the saga service to handle the CustomerCreatedEvent, but it doesn't receive the event.
  3. Start the account service. I would expect the account service to receive a CreateAccountCommand originating from the saga service, but it doesn't, because the saga never handled the event in step 2.

Scenario C

Both the customer and account service are up and running. The saga service is offline.

  1. Create a new customer.
  2. Start the saga service. Again, I would expect the saga service to pick up the CustomerCreatedEvent and proceed, but it doesn't.

Since the expected behaviour does not occur in any of these cases, the application ends up in an inconsistent state: a customer without a corresponding account is not allowed to exist.

The saga service has a persistent store in MySQL for both Quartz and Axon, using the following configuration. Axon uses the Jackson serializer for events.

# Persistence configuration (MySQL) 
###################################
# Quartz persistence
spring.quartz.job-store-type=jdbc
spring.quartz.jdbc.initialize-schema=always
spring.quartz.properties.org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreTX
spring.quartz.properties.org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.StdJDBCDelegate
spring.quartz.properties.org.quartz.jobStore.dataSource=dsQuartz
spring.quartz.properties.org.quartz.jobStore.tablePrefix=QRTZ_
spring.quartz.properties.org.quartz.dataSource.dsQuartz.user = ******
spring.quartz.properties.org.quartz.dataSource.dsQuartz.password = *******
spring.quartz.properties.org.quartz.dataSource.dsQuartz.maxConnections = 10
spring.quartz.properties.org.quartz.dataSource.dsQuartz.driver = com.mysql.cj.jdbc.Driver
spring.quartz.properties.org.quartz.dataSource.dsQuartz.URL = jdbc:mysql://192.168.99.100:3306/saga-store

# Axon persistence
spring.datasource.url=jdbc:mysql://192.168.99.100:3306/saga-store
spring.datasource.driverClassName=com.mysql.cj.jdbc.Driver
spring.datasource.username=******
spring.datasource.password=*****
spring.jpa.database-platform=org.hibernate.dialect.MySQL8Dialect
spring.jpa.hibernate.ddl-auto=update

Question: Am I misunderstanding some of the basic concepts of microservices and orchestration sagas, or am I overlooking something in the way I have set up and designed the microservices containing the business logic and the saga orchestration logic?

Thank you for reading my post and pointing out where I am wrong.


Solution

  • Without knowing more about your setup, I would say it comes down to configuration. A @Saga is backed by a Streaming Event Processor under the hood, and that processor can start reading the event stream at the tail (the oldest event) or at the head (the newest event).

    The default for a Saga's Streaming Event Processor is the head, as stated in our reference guide. If nothing else is configured, your Saga will only react to new events and not to events from the past, which matches the scenarios you describe, where the CustomerCreatedEvent is published before the saga service starts. Think carefully before switching from one to the other.

    Another important point is the Saga Store: if nothing is configured, the InMemorySagaStore is used. That is rarely ideal, so you will want to configure a persistent one. All the pieces are, once again, described in our reference guide.
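
The difference between starting at the head and starting at the tail can be illustrated with a small plain-Java simulation. This is only a sketch of the idea and does not use Axon's API: `EventLog`, `Processor`, `StartPosition` and `run` are hypothetical names, and the integer position stands in for a tracking token.

```java
import java.util.ArrayList;
import java.util.List;

public class HeadVsTailDemo {

    /** Where a brand-new processor (one without a stored position) starts reading. */
    public enum StartPosition { HEAD, TAIL }

    /** Minimal append-only log standing in for an event store (hypothetical). */
    public static class EventLog {
        private final List<String> events = new ArrayList<>();

        public void append(String event) { events.add(event); }

        public int size() { return events.size(); }

        /** Returns a copy of all events from the given index onwards. */
        public List<String> readFrom(int index) {
            return new ArrayList<>(events.subList(index, events.size()));
        }
    }

    /** Processor that remembers its position, much like a tracking token would. */
    public static class Processor {
        private int nextIndex;
        private final List<String> handled = new ArrayList<>();

        public Processor(EventLog log, StartPosition start) {
            // TAIL: begin at the oldest event. HEAD: skip everything already in the log.
            this.nextIndex = (start == StartPosition.TAIL) ? 0 : log.size();
        }

        /** Handles every event appended since the last poll. */
        public void poll(EventLog log) {
            List<String> batch = log.readFrom(nextIndex);
            handled.addAll(batch);
            nextIndex += batch.size();
        }

        public List<String> handled() { return handled; }
    }

    /** Replays the situation from the question: an event exists before the saga service starts. */
    public static List<String> run(StartPosition start) {
        EventLog log = new EventLog();
        log.append("CustomerCreatedEvent(1)");      // published while the saga service is offline
        Processor saga = new Processor(log, start); // the saga service starts for the first time
        saga.poll(log);
        log.append("CustomerCreatedEvent(2)");      // published while the saga service is running
        saga.poll(log);
        return saga.handled();
    }

    public static void main(String[] args) {
        System.out.println("HEAD: " + run(StartPosition.HEAD));
        System.out.println("TAIL: " + run(StartPosition.TAIL));
    }
}
```

Running `main` prints `HEAD: [CustomerCreatedEvent(2)]` and `TAIL: [CustomerCreatedEvent(1), CustomerCreatedEvent(2)]`: a processor that starts at the head never sees the event published before it came up. In Axon 4 the corresponding setting is, as far as I know, the initial tracking token of the saga's processor, for example `TrackingEventProcessorConfiguration.andInitialTrackingToken(StreamableMessageSource::createTailToken)` registered through the `EventProcessingConfigurer`; check the reference guide for the exact form in your version. The initial token only matters when no token has been stored for that processor yet.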