Tags: validation, domain-driven-design, abstraction, consistency, data-integrity

How to ensure data integrity when the domain changes


I'm working on a project where I applied DDD principles.

To ensure domain integrity, I validate each domain model (entity or value object) on creation.

Example of the user entity:

class User {
  constructor(opts) {
    this.email = opts.email;
    this.password = opts.password;
    this.validate();
  }

  validate() {
    if (typeof this.email !== 'string') {
      throw new Error('email is invalid');
    }

    if (typeof this.password !== 'string') {
      throw new Error('password is invalid');
    }
  }
}

The validate method is a deliberately naive implementation of validation (I know I should verify the email with a regex and handle errors more effectively).
This model is then persisted using the userRepository module.

Now, imagine I want to add a new property, username, to my user model. My validate method will then look like this:

validate() {
  if(typeof this.email !== 'string') {
    throw new Error('email is invalid');
  }

  if(typeof this.password !== 'string') {
    throw new Error('password is invalid');
  }

  if(typeof this.username !== 'string') {
    throw new Error('username is invalid');
  }
}

The problem is that previously stored user models will not have the username property, which is now required. Therefore, when I fetch data from the database and try to construct the model, it will throw an error.

To fix this problem I see multiple solutions (but none seems good to me):

  • Create an anti-corruption layer in the user repository (create a default username if none is defined)
  • Relax the invariant in my domain model (username is not required)
  • Use cron services that update stored entities whenever the domain changes (again, set a default username)

Solution

  • "The problem is that old user models stored will not have the username property which is now required."

    Yup, that's a problem.

    Here's how I think about it -- the persisted copy of your domain model is a message, sent by an instance of your domain model running in the past to an instance of your domain model running in the future.

    If you want those messages to be compatible, then you need to accept certain constraints in the design of your message schema.

    One of those constraints is that you don't add new required fields to an existing message type.

    Adding optional fields is fine, because systems that don't care can ignore the optional fields, and the systems that do care can provide a default value when the field is missing.
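As a rough illustration of the optional-field idea, here is a minimal sketch that assumes a persisted record is a plain object. The `readUser` helper and the choice of `null` as the "not set" default are hypothetical, not part of the original code.

```javascript
// Hypothetical reader for persisted user records.
// `username` is optional: records written before the field existed simply lack it.
function readUser(record) {
  if (typeof record.email !== 'string') {
    throw new Error('email is invalid');
  }
  if (typeof record.password !== 'string') {
    throw new Error('password is invalid');
  }
  // Only validate the optional field when it is actually present.
  if (record.username !== undefined && typeof record.username !== 'string') {
    throw new Error('username is invalid');
  }
  return {
    email: record.email,
    password: record.password,
    // A consumer that cares supplies its own default; null here means "not set".
    username: record.username ?? null,
  };
}

const oldRecord = { email: 'a@example.com', password: 'secret' };
const newRecord = { email: 'b@example.com', password: 'secret', username: 'bee' };

console.log(readUser(oldRecord).username); // null
console.log(readUser(newRecord).username); // bee
```

Old records still load, and a present but malformed username is still rejected.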

    But if you need to add a new required field, then you create a new message.

    The event sourcing community has to worry about this sort of thing a lot (events are messages); Greg Young wrote Versioning in an Event Sourced System, which has good lessons on the versioning of messages.

    "To fix this problem I see multiple solutions (but none seems good to me)"

    I agree, these are all kind of lousy - in the sense that they are all introducing a mechanism for deriving a "default" user name where none exists. That being the case, the field is effectively optional; so why claim that it is required?

    In a situation where the field isn't required, but you want to stop accepting new data that doesn't include this field -- you probably want to put new validation on the data input code path. That is to say, you can create a new API with messages that require the field, validate those messages, and then use the domain model with the optional field to store and fetch the data.
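One way this split could look, as a sketch only: the domain model keeps `username` optional so older records still load, while a separate input handler enforces it for new data. The `registerUser` function and the shape of its `command` argument are assumptions made for illustration.

```javascript
// Domain model: username is optional, so records stored before the change still load.
class User {
  constructor(opts) {
    this.email = opts.email;
    this.password = opts.password;
    this.username = opts.username; // may be undefined for older users
    this.validate();
  }

  validate() {
    if (typeof this.email !== 'string') {
      throw new Error('email is invalid');
    }
    if (typeof this.password !== 'string') {
      throw new Error('password is invalid');
    }
    if (this.username !== undefined && typeof this.username !== 'string') {
      throw new Error('username is invalid');
    }
  }
}

// Hypothetical handler for the new input API: on this path the field IS required.
function registerUser(command) {
  if (typeof command.username !== 'string' || command.username.length === 0) {
    throw new Error('username is required');
  }
  return new User(command);
}

// Loading an old record through the model still works:
const legacyUser = new User({ email: 'a@example.com', password: 'secret' });

// New registrations must carry a username:
const newUser = registerUser({ email: 'b@example.com', password: 'secret', username: 'bee' });
```

The stricter rule lives at the edge of the system, so tightening it later does not invalidate anything already stored.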

    "So adding a new required field is an anti-pattern in DDD"

    Adding new required fields is an anti-pattern in messaging; DDD has little to do with it.

    You shouldn't expect to be able to add required fields to an existing schema in a backwards compatible way. Instead, you extend the message schema by introducing a new message in which the field is required.
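A sketch of what "introducing a new message" might look like in practice, assuming each persisted record carries a `type` tag. The tag values `user.v1` and `user.v2` and the helper names are invented for this example.

```javascript
// Shared check used by both readers.
function requireString(msg, field) {
  if (typeof msg[field] !== 'string') {
    throw new Error(`${field} is invalid`);
  }
}

// Hypothetical readers keyed by message type. The old type is left untouched;
// the new required field only exists on the new type.
const readers = new Map([
  ['user.v1', (msg) => {
    requireString(msg, 'email');
    requireString(msg, 'password');
    return { type: 'user.v1', email: msg.email, password: msg.password };
  }],
  ['user.v2', (msg) => {
    requireString(msg, 'email');
    requireString(msg, 'password');
    requireString(msg, 'username'); // required from the first v2 message onward
    return {
      type: 'user.v2',
      email: msg.email,
      password: msg.password,
      username: msg.username,
    };
  }],
]);

function readMessage(msg) {
  const reader = readers.get(msg.type);
  if (!reader) {
    throw new Error(`unknown message type: ${msg.type}`);
  }
  return reader(msg);
}
```

Records written before the change keep reading as `user.v1`, and every `user.v2` record is guaranteed to have a username.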

    "I thought applying DDD principles help to handle the business logic complexity and also help to design evoluting software and evoluting domain models"

    It does, but it isn't magic. If your new models aren't backward compatible with the old models, then you are going to have to manage that change in some way.

    You might declare bankruptcy, and simply forget all previous history. You might migrate your existing data to the new data model. You might maintain the two different data models in parallel.
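The migration option could be sketched roughly as follows, with an in-memory array standing in for the database. The `migrateRecords` function is hypothetical, and the rule that derives a username from the email is an assumption for illustration only; where the missing values come from is a business decision.

```javascript
// Hypothetical one-off migration over stored records.
// `chooseUsername` is supplied by the caller, because only the business
// can say what a missing username should become.
function migrateRecords(records, chooseUsername) {
  return records.map((record) => {
    if (typeof record.username === 'string') {
      return record; // already in the new shape
    }
    return { ...record, username: chooseUsername(record) };
  });
}

const stored = [
  { email: 'a@example.com', password: 'secret' },
  { email: 'b@example.com', password: 'secret', username: 'bee' },
];

// Assumption for illustration only: use the part of the email before the '@'.
const migrated = migrateRecords(stored, (record) => record.email.split('@')[0]);

console.log(migrated.map((record) => record.username)); // [ 'a', 'bee' ]
```

Once every stored record has been rewritten this way, the model can treat the field as required.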

    In other words, backwards compatibility is a long-term concern that you should be thinking about as you design your solution.