Tags: language-agnostic, architecture, domain-driven-design, repository-pattern, ddd-repositories

Repository pattern: repository per aggregate or per underlying data store?


It is recommended to have one repository per aggregate.

However, I have a case where the same aggregate can be fetched from two heterogeneous data stores. As background, the object is:

  1. fetched from data store A (remote & read-only)
  2. presented to the user for validation
  3. on validation, imported into data store B (local & read-write)
  4. from then on, fetched from and modified in data store B

Obviously (or not), I can't have a single repository for this aggregate: at some point I need to know which data store the object was fetched from.

Given that the domain layer should know nothing about the infrastructure, my case somewhat breaks my understanding of how the repository pattern, and DDD in general, should be implemented.

Did I get something wrong?


Solution

  • Did I get something wrong?

    Seems to me what you got wrong is having two data stores for the same data.

    If there is indeed a good reason for this redundancy, the objects in the two stores must differ in some way, and that might justify treating them as separate aggregates with two repositories.
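    A minimal sketch of that two-aggregate option, assuming Python and entirely hypothetical names (`CatalogItem`, `ImportedItem`, `CatalogRepository`, `ImportedItemRepository`, `import_item`). Plain dicts stand in for the real data stores, so this only illustrates the shape of the design, not an actual persistence layer:

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class CatalogItem:
    """Read-only candidate as seen in remote data store A (hypothetical name)."""
    item_id: str
    name: str


@dataclass
class ImportedItem:
    """Locally owned, editable aggregate in data store B (hypothetical name)."""
    item_id: str
    name: str

    def rename(self, new_name: str) -> None:
        self.name = new_name


class CatalogRepository:
    """Read-only repository; a dict stands in for the remote store."""

    def __init__(self, remote: Dict[str, str]) -> None:
        self._remote = remote

    def find(self, item_id: str) -> Optional[CatalogItem]:
        name = self._remote.get(item_id)
        return CatalogItem(item_id, name) if name is not None else None


class ImportedItemRepository:
    """Read-write repository; a dict stands in for the local store."""

    def __init__(self) -> None:
        self._rows: Dict[str, str] = {}

    def find(self, item_id: str) -> Optional[ImportedItem]:
        name = self._rows.get(item_id)
        return ImportedItem(item_id, name) if name is not None else None

    def save(self, item: ImportedItem) -> None:
        self._rows[item.item_id] = item.name


def import_item(candidate: CatalogItem, repo: ImportedItemRepository) -> ImportedItem:
    """Turn a user-validated candidate into a local aggregate and persist it."""
    item = ImportedItem(candidate.item_id, candidate.name)
    repo.save(item)
    return item
```

    With this split, the import step becomes an explicit domain operation between two types, and the domain never has to ask which store an object came from: its type already says so.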

    If you want to treat them as a single aggregate, a single repository should know how to pick the correct datastore, while keeping that knowledge encapsulated away from your domain model.

    EDIT:

    In the situation explained in the comments, where one datastore is read-only and the other is a local, modifiable copy, having two datastores is in fact forced on you. Your repository needs to know about both datastores and use the remote read-only store only if it does not find something locally. Immediately upon retrieving something from the remote store, it should save it to the local store and thereafter use the local copy.
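    The local-first lookup described above could be sketched as follows. This is only an illustration in Python under stated assumptions: `Item` and `ItemRepository` are hypothetical names, and two dicts stand in for the remote (read-only) and local (read-write) datastores:

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Item:
    """The single aggregate the domain works with (hypothetical name)."""
    item_id: str
    name: str


class ItemRepository:
    """Single domain-facing repository; both datastores are hidden behind it."""

    def __init__(self, remote: Dict[str, str]) -> None:
        self._remote = remote              # stand-in for the read-only remote store
        self._local: Dict[str, str] = {}   # stand-in for the read-write local store

    def find(self, item_id: str) -> Optional[Item]:
        # Prefer the local copy whenever one exists.
        if item_id in self._local:
            return Item(item_id, self._local[item_id])
        # Fall back to the remote store only when nothing is found locally.
        name = self._remote.get(item_id)
        if name is None:
            return None
        # Save to the local store immediately, so later reads use the local copy.
        self._local[item_id] = name
        return Item(item_id, name)

    def save(self, item: Item) -> None:
        # Writes only ever touch the local store; the remote is never modified.
        self._local[item.item_id] = item.name
```

    Callers see only `find` and `save`; which store served a given read is an infrastructure detail that stays inside the repository.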

    This logic is something like a caching proxy, but not exactly, since the remote is read-only and the local is read-write. It might involve enough logic to be extracted into a service used by the repository, but it shouldn't be exposed to the domain.

    This situation also has some risks that you need to think about. Once you've saved something locally, you have two versions of the same data, which can get out of sync. What do you do if someone with write access to the remote changes the data after you've changed your local copy?
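    One possible way to at least detect that divergence, sketched in Python. It rests on a strong assumption that nothing in the question confirms: that the remote store exposes some version number (or timestamp) per object. `LocalCopy`, `imported_remote_version` and `remote_status` are hypothetical names:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LocalCopy:
    """Local record that remembers which remote version it was copied from."""
    item_id: str
    name: str
    imported_remote_version: int   # assumed: the remote exposes a version number
    locally_modified: bool = False


def remote_status(copy: LocalCopy, current_remote_version: Optional[int]) -> str:
    """Classify the local copy against the remote's current version."""
    if current_remote_version is None:
        return "remote-missing"      # the object no longer exists remotely
    remote_changed = current_remote_version != copy.imported_remote_version
    if remote_changed and copy.locally_modified:
        return "conflict"            # both sides changed since the import
    if remote_changed:
        return "remote-updated"      # safe to refresh: no local edits to lose
    return "no-remote-change"        # the remote is as it was at import time
```

    What to do on `"conflict"` (keep local, overwrite, or ask the user) is a business decision; the sketch only makes the situation visible.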