java oop design-patterns continuous-deployment state-pattern

Best way to apply versioning to State object-oriented design pattern?

Main question

What are some good ways to manage state machine versions (including states themselves, state transition methods, etc) in object-oriented design, particularly during deployments when multiple versions are available?

Context/application

My team is re-architecting a back-end microservice with better OOP design, and we've opted to use the State pattern to model the various parts of the workflow. Work in different states can be separated by long periods of time, because workflow execution is event-driven and sometimes involves calling external services. In other words: in one state, we'll call a dependency and wait until we get an asynchronous response/notification before moving to and executing work in the next state.

As such, there is a chance that we run into a "mixed fleet" issue during software deployments when we update the state machine, the transition methods, etc. For example, suppose the "v1" state machine transitions look like this:

Start -> File Retrieved -> Hashing Complete -> File Published -> Finished

and then suppose I want to add a new state to remove personally identifiable information (PII), which would go between File Retrieved and Hashing Complete and change the state transitions out of/into those states, respectively. The "v2" state machine transitions would now look like this:

Start -> File Retrieved -> PII Removed -> Hashing Complete -> File Published -> Finished

During deployment of this change, I want to make sure that any requests that start with the OLD/"v1" state machine workflow continue to use that workflow, rather than transitioning to the NEW/"v2" state machine workflow as it rolls out to the fleet.

My current approach is to store a "version" property/field in the database records with each request, so that it's easy to figure out with which state machine version the request was initiated. Right now, I am stuck figuring out the best way to model the versions in code. Some ideas I had:

Use separate Java packages for different versions. Use a Factory pattern or something that can instantiate state objects based on whatever version number is provided. Keeps state machine implementations fully isolated, but at the cost of an explosion of classes.
Integrate a "version" property/field into the state classes directly. Not sure how hard this would be to manage over time, and code might get ugly.
Something else?
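
To illustrate the first idea, here is a rough sketch only. The nested classes `V1` and `V2` stand in for what would be separate packages, and every name here (`StateMachine`, `forVersion`, `nextState`, the state name strings) is hypothetical rather than taken from our code base.

```java
import java.util.Arrays;
import java.util.List;

final class VersionedFactorySketch {

    /** Ordered state names of one state machine version (hypothetical interface). */
    interface StateMachine {
        List<String> states();
    }

    /** Stand-in for a "v1" package. */
    static final class V1 implements StateMachine {
        public List<String> states() {
            return Arrays.asList("Start", "FileRetrieved", "HashingComplete",
                    "FilePublished", "Finished");
        }
    }

    /** Stand-in for a "v2" package: adds the PII removal state. */
    static final class V2 implements StateMachine {
        public List<String> states() {
            return Arrays.asList("Start", "FileRetrieved", "PiiRemoved",
                    "HashingComplete", "FilePublished", "Finished");
        }
    }

    /** Factory keyed by the version stored with the request record. */
    static StateMachine forVersion(int version) {
        switch (version) {
            case 1: return new V1();
            case 2: return new V2();
            default:
                throw new IllegalArgumentException("Unknown state machine version: " + version);
        }
    }

    /** Looks up the state that follows 'current' in the given version. */
    static String nextState(int version, String current) {
        List<String> states = forVersion(version).states();
        int i = states.indexOf(current);
        if (i < 0 || i == states.size() - 1) {
            throw new IllegalArgumentException("No next state after: " + current);
        }
        return states.get(i + 1);
    }

    public static void main(String[] args) {
        System.out.println(nextState(1, "FileRetrieved")); // prints "HashingComplete"
        System.out.println(nextState(2, "FileRetrieved")); // prints "PiiRemoved"
    }
}
```

With real state classes instead of strings, each new version would duplicate most of the previous one, which is the class explosion I'm worried about.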

Open to any suggestions. We've also considered some alternative design patterns (Pipeline, CoR, etc) so open to exploring any of those as well.

For additional context: We're using the State pattern to make the workflows idempotent, abstract the business logic from orchestration, and (most importantly) make it extremely easy for future developers to see exactly where business logic SHOULD live and how to add to/modify the workflow going forward.

Thanks!

Solution

I think you're making a mistake. A lot of times, this "mixed fleet" issue is going to be a feature, not a bug. In addition, this notion of "version" that you want to add creates complexity and restrictions, but adds very little value.

I think you just need to change your idea of what a state is. It's not "a step in the workflow". It's the entire remainder of the workflow process. That remaining process may involve multiple steps or it may not, but it's up to the state implementation to decide.

In a typical implementation of this kind of workflow system, there is a persisted "current state". When an event occurs, the current state is read from the DB and hydrated somehow into an implementation object. A handler in that object is then called to perform the appropriate action, and optionally return a new current state definition to be persisted.
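
As a minimal sketch of that loop (read the persisted state, hydrate it, call the handler, optionally persist a new state), something like the following could work. Everything named here is an assumption made for illustration: `State`, `StateStore`, `Engine`, the `WaitingForFile` state and the `"fileArrived"` event are hypothetical, and an in-memory map stands in for the DB.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

final class WorkflowEngineSketch {

    /** A state implementation: handles one event and may name the next state. */
    interface State {
        Optional<String> onEvent(String event);
    }

    /** Stand-in for the DB: request id -> persisted current state name. */
    static final class StateStore {
        private final Map<String, String> rows = new HashMap<>();

        String load(String requestId) { return rows.get(requestId); }

        void save(String requestId, String stateName) { rows.put(requestId, stateName); }
    }

    /** Hypothetical state that waits for a "fileArrived" event. */
    static final class WaitingForFile implements State {
        public Optional<String> onEvent(String event) {
            if ("fileArrived".equals(event)) {
                return Optional.of("Done"); // new current state to persist
            }
            return Optional.empty();        // stay in the same state
        }
    }

    /** Hydrates persisted state names into implementation objects and runs them. */
    static final class Engine {
        private final StateStore store;
        private final Map<String, Supplier<State>> registry = new HashMap<>();

        Engine(StateStore store) { this.store = store; }

        void register(String stateName, Supplier<State> factory) {
            registry.put(stateName, factory);
        }

        void handle(String requestId, String event) {
            String current = store.load(requestId);           // read current state
            Supplier<State> factory = registry.get(current);
            if (factory == null) {
                throw new IllegalStateException("Unknown state: " + current);
            }
            State state = factory.get();                      // hydrate
            state.onEvent(event)                              // call the handler
                 .ifPresent(next -> store.save(requestId, next)); // persist if changed
        }
    }

    public static void main(String[] args) {
        StateStore store = new StateStore();
        Engine engine = new Engine(store);
        engine.register("WaitingForFile", WaitingForFile::new);
        store.save("req-1", "WaitingForFile");
        engine.handle("req-1", "fileArrived");
        System.out.println(store.load("req-1")); // prints "Done"
    }
}
```

Note that the only thing persisted is a state name, so whichever host picks up the next event decides what that name means.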

The key to making a system like this comprehensible and maintainable is the design of the persisted state representation. This representation is the language that a state implementation uses to define what happens next. That language should be expressive enough to define any remaining process that might be necessary, and also simple, straightforward, and unambiguous, so that there is no confusion about how the serialized definition of "what happens next" turns into an implementation.

Your version idea ruins the "simple, straightforward" part, making it more difficult to figure out what's going to happen next in any workflow.

To make this concrete, let's consider your example...

In version 1, the start state transitions to the "RetrieveAndPublishFile" state. That state then transitions to the "PublishFileData" state. See how the name reflects the whole remainder of the process?

In version 2, you change the "RetrieveAndPublishFile" implementation so that it transitions to "CleanPiiAndPublish" instead of "PublishFileData". The "CleanPiiAndPublish" state cleans out the PII and then transitions to the "PublishFileData" state that you already have.
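
Here is a small sketch of why no version field is needed in that example. It models each build as a table from state name to next state name; the state names come from the example above, while `v1Build`, `v2Build`, `run` and the `"Done"` marker are hypothetical helpers for illustration, not a real implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MixedFleetSketch {

    static final String DONE = "Done"; // hypothetical terminal marker

    /** Transitions as deployed in the old build: state name -> next state name. */
    static Map<String, String> v1Build() {
        Map<String, String> t = new HashMap<>();
        t.put("RetrieveAndPublishFile", "PublishFileData");
        t.put("PublishFileData", DONE);
        return t;
    }

    /** Transitions in the new build: one changed entry, one new state. */
    static Map<String, String> v2Build() {
        Map<String, String> t = new HashMap<>(v1Build());
        t.put("RetrieveAndPublishFile", "CleanPiiAndPublish"); // changed transition
        t.put("CleanPiiAndPublish", "PublishFileData");        // new state
        return t;
    }

    /** Runs a request from its persisted state to completion on one build. */
    static List<String> run(Map<String, String> build, String persistedState) {
        List<String> visited = new ArrayList<>();
        String current = persistedState;
        while (!DONE.equals(current)) {
            visited.add(current);
            String next = build.get(current);
            if (next == null) {
                throw new IllegalStateException("No transition from: " + current);
            }
            current = next;
        }
        return visited;
    }

    public static void main(String[] args) {
        // A request an old host already advanced to "PublishFileData" finishes
        // the old way, even when a new host picks it up.
        System.out.println(run(v2Build(), "PublishFileData"));
        // prints "[PublishFileData]"

        // A request still at "RetrieveAndPublishFile" gets PII cleaning on a new host.
        System.out.println(run(v2Build(), "RetrieveAndPublishFile"));
        // prints "[RetrieveAndPublishFile, CleanPiiAndPublish, PublishFileData]"
    }
}
```

The persisted name alone determines the rest of the process, so requests that were already past the changed transition are unaffected by the rollout.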