
DDD: Aggregate design - Referencing between aggregates

I have an issue with how to design aggregates.

I have Company, City, Province and Country entities. Each of these needs to be an aggregate root of its own aggregate. The City, Province and Country entities are used throughout the system and are referenced by many other entities, so they are not value objects and also need to be accessed in many different scenarios. So they should have repositories. A CityRepository would have methods such as FindById(int), GetAll(), GetByProvince(Province), GetByCountry(Country), GetByName(string).

Take the following example. A Company entity is associated with a City, which belongs to a Province, which belongs to a Country:

[Diagram: Aggregate Roots (Company, City, Province, Country)]

Now let's say we have a company listing page which lists some companies with their city, province and country.

Reference by ID

If an entity needs to reference a City, Province or Country, it would do so by ID (as suggested by Vaughn Vernon).
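As a rough sketch of what referencing by ID could look like: the property names CityId, ProvinceId and CountryId match the query code in this question, while CompanyId, the constructors and everything else are assumptions for illustration only.

```csharp
// Each class is the root of its own aggregate and refers to the
// other aggregates only by identity, never by object reference.
public sealed class Country
{
    public int CountryId { get; }
    public string Name { get; }

    public Country(int countryId, string name)
    {
        CountryId = countryId;
        Name = name;
    }
}

public sealed class Province
{
    public int ProvinceId { get; }
    public string Name { get; }
    public int CountryId { get; }   // ID reference to the Country aggregate

    public Province(int provinceId, string name, int countryId)
    {
        ProvinceId = provinceId;
        Name = name;
        CountryId = countryId;
    }
}

public sealed class City
{
    public int CityId { get; }
    public string Name { get; }
    public int ProvinceId { get; }  // ID reference to the Province aggregate

    public City(int cityId, string name, int provinceId)
    {
        CityId = cityId;
        Name = name;
        ProvinceId = provinceId;
    }
}

public sealed class Company
{
    public int CompanyId { get; }
    public string Name { get; }
    public int CityId { get; }      // ID reference to the City aggregate

    public Company(int companyId, string name, int cityId)
    {
        CompanyId = companyId;
        Name = name;
        CityId = cityId;
    }
}
```

With this shape, a Company knows only which City it belongs to; the Province and Country have to be resolved through further lookups.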

To get the data for the listing page from the repositories, we need to call four different repositories and then match up the results in memory to populate the view.

var companies = CompanyRepository.GetBySomeCriteria();
var cities = CityRepository.GetByIds(companies.Select(x => x.CityId));
var provinces = ProvinceRepository.GetByIds(cities.Select(x => x.ProvinceId));
var countries = CountryRepository.GetByIds(provinces.Select(x => x.CountryId));

// Match each company with its city, province and country in memory.
var viewModels = new List<CompanyLineViewModel>();
foreach (var company in companies)
{
    var city = cities.Single(x => x.CityId == company.CityId);
    var province = provinces.Single(x => x.ProvinceId == city.ProvinceId);
    var country = countries.Single(x => x.CountryId == province.CountryId);

    // Collect one line per company instead of overwriting a single variable.
    viewModels.Add(new CompanyLineViewModel(company.Name, city.Name, province.Name, country.Name));
}

This is very bulky and inefficient, but apparently it is the 'correct' way?

Reference by Reference

If the entities were referenced by reference, the same query would look like this:

var companies = CompanyRepository.GetBySomeCriteria();
var viewModels = companies.Select(company => new CompanyLineViewModel(
    company.Name, company.City.Name, company.City.Province.Name, company.City.Province.Country.Name));

But as far as I understand, these entities cannot be referenced by reference as they exist in different aggregates.

Question

How else could I better design these aggregates?

Could I load company entities with the city model even when they exist in different aggregates? I imagine this would soon break the boundaries between aggregates. It would also create confusion when dealing with transactional consistency when updating aggregates.

Solution

Dennis Traub has already pointed out what you can do to improve query performance. That approach is much more efficient for querying, but also even more bulky, because you now need additional code to keep your view model in sync with the aggregates.
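The referenced approach is not reproduced here. Assuming it means persisting a dedicated, denormalized view model for the listing page, the extra synchronization code might look roughly like the sketch below. The CompanyLine row, the CityRenamed event and the CompanyLineProjection class are hypothetical names, and an in-memory dictionary stands in for whatever store would hold the view model.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical denormalized row for the company listing page.
public sealed class CompanyLine
{
    public int CompanyId { get; set; }
    public int CityId { get; set; }
    public string CompanyName { get; set; } = "";
    public string CityName { get; set; } = "";
}

// Hypothetical event published when a City aggregate is renamed.
public sealed record CityRenamed(int CityId, string NewName);

public sealed class CompanyLineProjection
{
    // Stand-in for a real table or document collection.
    private readonly Dictionary<int, CompanyLine> _lines = new();

    public void Add(CompanyLine line) => _lines[line.CompanyId] = line;

    // The extra work: every stored copy of the city name must be updated.
    public void Handle(CityRenamed e)
    {
        foreach (var line in _lines.Values.Where(l => l.CityId == e.CityId))
        {
            line.CityName = e.NewName;
        }
    }

    // Querying is a single read with no joins.
    public IReadOnlyCollection<CompanyLine> All() => _lines.Values.ToList();
}
```

Only the city name is handled here to keep the sketch short; province and country names would need equivalent handlers, which is where the additional bulk comes from.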

If you don't like that approach or cannot use it for other reasons, I don't think the first approach you suggest is any less efficient or more bulky than using direct object references. Suppose for a moment that you were using direct object references in the aggregates. How would you persist those aggregates to durable storage? When you are using a database, the following options come to mind:

1. If you are using a denormalized table for Company (e.g., with a document database such as MongoDB), you are effectively optimizing for a view query already. However, you'll need all the extra work to keep your Company table in sync with City, Province, etc. Efficient, but bulky, and you might consider persisting the real view models instead (one per use case).
2. If you are using normalized tables with a relational database, you would use foreign keys in the Company table to reference the respective City, Province, etc. by their ID. When querying for a Company, in order to retrieve the fields of City, Province, etc. that are needed to populate your view model, you can either use a JOIN over 4+ tables, or use 4 independent queries to the City, Province, ... tables (e.g., when using lazy loading for the foreign key references).
3. If you are using normalized tables in a non-relational database, people usually use application-side joins exactly as in the code you suggested. For some databases, ORM tools such as Morphia or DataNucleus can save you some programming work, but under the hood, the independent queries remain.

Therefore, with the 2nd and 3rd options, you save a bit of trivial programming work if you let an ORM solution generate the database mapping for you, but you don't gain much efficiency. (JOINs can be optimized with proper indices, but getting this right is non-trivial.)

However, I'd like to point out that you retain full control over the view model construction and the database queries when you reference by ID and use programmatic application-side joins, as in the code that you suggested. In particular, the names of cities, provinces, etc. change very rarely, there are only a few of them, and they easily fit into memory. Hence you can make extensive use of in-memory caching for the database queries, or even use in-memory repositories that are populated from flat files on application startup. When done right, constructing your view model for Company requires only one database call to the Company table, and the other fields are retrieved from the in-memory cache/repository, which I would consider extremely efficient.
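A minimal sketch of that idea, assuming the reference data has already been loaded into memory once at startup. InMemoryRepository and CompanyListing are hypothetical names, the record types only mirror the properties used in the question, and the companies argument stands in for the result of the single database call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed record Company(int CompanyId, string Name, int CityId);
public sealed record City(int CityId, string Name, int ProvinceId);
public sealed record Province(int ProvinceId, string Name, int CountryId);
public sealed record Country(int CountryId, string Name);
public sealed record CompanyLineViewModel(string Company, string City, string Province, string Country);

// Hypothetical in-memory repository: populated once, then served from a dictionary.
public sealed class InMemoryRepository<T>
{
    private readonly Dictionary<int, T> _byId;

    public InMemoryRepository(IEnumerable<T> items, Func<T, int> idOf)
    {
        _byId = items.ToDictionary(idOf);
    }

    // Constant-time lookup; throws KeyNotFoundException for an unknown ID.
    public T FindById(int id) => _byId[id];
}

public static class CompanyListing
{
    // 'companies' is the only input that needs a database round trip;
    // city, province and country are resolved from memory.
    public static List<CompanyLineViewModel> Build(
        IEnumerable<Company> companies,
        InMemoryRepository<City> cities,
        InMemoryRepository<Province> provinces,
        InMemoryRepository<Country> countries)
    {
        var lines = new List<CompanyLineViewModel>();
        foreach (var company in companies)
        {
            var city = cities.FindById(company.CityId);
            var province = provinces.FindById(city.ProvinceId);
            var country = countries.FindById(province.CountryId);
            lines.Add(new CompanyLineViewModel(company.Name, city.Name, province.Name, country.Name));
        }
        return lines;
    }
}
```

The dictionary lookups replace the repeated Single(...) scans from the question's code, and the aggregates still reference each other only by ID. Cache invalidation for the rare rename is left out of the sketch.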