Search code examples
pythondomain-driven-designrepository-pattern

Prevent exception-based control flow in upper layer of Repository pattern


Although I'm using Python here, my question is not related to a particular language. I omitted many implementation details for brevity, it is almost pseudocode at this point; and I'm not tied to a particular framework or ORM. Also, since the Repository pattern is often related to DDD, I'm using DDD terms. I'm not claiming to fully comprehend DDD. My project is rather simple and DDD may not be suited at all, but it was an opportunity to learn about it. And DDD or not, I'd still probably end up with similar layers: models, model instances reconstitution from the database, and business logic.


In a recent project, I had to add a business (and conceptual object-oriented) view on top of a raw SQL schema (which cannot change for legacy reasons; it comes from an open-source tool widely deployed). That "business layer" also has to implement some trivial logic such as:

  • a User cannot be created if it already exists
  • a User cannot be added to a non-existing Group
  • a Group cannot be deleted if it still contains users
  • the deletion of a User will delete on cascade some user-related Items

I came across the Repository pattern: in my case—and by definition—it would help to emulate the conceptual view built on top of the raw database schema. Here, I can reconstitute objects (Aggregate Roots as per the DDD) taken from the database and I can add or remove them from the collection, hiding the actual database machinery.

So I ended up having a UserRepository:

# User model (aggregate root)
class User:
  def __init__(self, name: str, groups: List[Group] = [], items: List[Item] = []):
    # some invariant
    if not (groups or items):
      raise ValueError("A user must have at least one item or must belong to at least one group")
    self.name = name
    self.groups = groups
    self.items = items

# User repository
class UserRepository:
  def __init__(self, db_conn):
    self.db_conn = db_conn
  
  def exists(self, username: str) -> bool: # ...
  def find_all(self) -> list[User]: # ...
  def find_one(self, username: str) -> User:
    if not self.exists(username):
      return None
    groups = self.db_conn.execute(f"SELECT FROM usergroup WHERE ...")
    items = self.db_conn.execute(f"SELECT FROM useritem WHERE ...")
    return User(username, groups, items)
  
  def add(self, user: User):
    for group in user.groups:
      self.db_conn.execute(f"INSERT INTO usergroup ...") # insert group belongings
    for item in user.items:
      self.db_conn.execute(f"INSERT INTO useritem ...") # insert user items
    
  def remove(self, username: str):
    self.db_conn.execute(f"DELETE FROM usergroup WHERE ...") # delete group belongings
    self.db_conn.execute(f"DELETE FROM useritem WHERE ...") # delete items

From the database point of view, the User object actually exists through several tables (say two tables, to simplify). It's why I found the Repository pattern useful here: I hide these details in the repository which has the responsibility to reconstitute the objects taken from the database and to flatten them in some way during database insertion.

Now what about the user logic? As per the DDD, it lies in a UserService:

class UserAlreadyExistsException(Exception): # ...
class UserNotFoundException(Exception): # ...
class GroupNotFoundException(Exception): # ...

class UserService:
  def __init__(self, user_repo: UserRepository, group_repo: GroupRepository):
    self.user_repo = user_repo
    self.group_repo = group_repo
  
  def get(self, username: str) -> User:
    user = self.user_repo.find_one(username)
    if not user:
      raise UserNotFoundException
    return user

  def create(self, user: User) -> User:
    if self.user_repo.exists(user.name):
      raise UserAlreadyExistsException
      
    for group in user.groups:
      if not self.group_repo.exists(group.name):
        raise GroupNotFoundException(f"'{group.name}' has not been found")
    
    self.user_repo.add(user)
    return user
  
  def delete(self, username: str):
    if not self.user_repo.exists(username):
      raise UserNotFoundException
    self.user_repo.remove(username)

This service makes use of both user and group repositories. It contains the business logic I mentioned at the beginning. My concern, and you may see it coming, is about the raised exceptions: would that imply exception-based control flow in the upper application layer? If so, how to prevent it?

I have two apps which in turn make use of this service: a Web app (REST API) and a Python app. They will consume the service and trust it so that business rules are ensured (no duplicated users, no users created in non-existing groups, etc.).

# Web app consuming UserService

@router.get("/users/{username}", status_code=200)
def get_user(username: str) -> User:
  try:
    return user_service.get(username)
  except UserNotFoundException:
    raise HTTPException(404, "Given user does not exist")

@router.post("/users", status_code=201)
def post_user(user: User) -> User:
  try:
    return user_service.create(user)
  except UserAlreadyExistsException:
    raise HTTPException(409, "Given user already exists")
  except GroupNotFoundException as exc:
    raise HTTPException(422, str(exc))

@router.delete("/users/{username}", status_code=204)
def delete_user(username: str):
  try:
    user_service.delete(username)
  except UserNotFoundException:
    raise HTTPException(404, detail="Given user does not exist")

But I end up with this exception-based control flow, which I don't like and which is known to be an anti-pattern.

Looking at this popular DDD example project in Java, it shows the same issue: the CustomerService throws an exception, which is caught and re-raised in the associated web service. Someone did ask about this but did not get an answer.

An alternative would be to re-implement the logic in my apps, but this lead to multiple problems:

  • the UserService will have to replicate the UserRepository operations to make them accessible
  • the domain logic will leak into the application layer
  • the domain logic will be duplicated into each application

EDIT

  • This article from 2019 states the problem very well and also describes an alternative
  • This article from 2023 also suffers from exception-based control flow:
class PaymentController {

    public Response handlePayment(Request request, int amountInCents) {
        try {
          this.paymentService.createPaymentForUser(request.getUser(), amountInCents);
          return new Response(200, new HashMap<>());
        } catch (PaymentValidationException ex) {
          throw new ResponseStatusException(400, ex.getMessage())
        } catch (UserNotFoundException ex) {
          throw new ResponseStatusException(401, "You are not authorized to create payments for this user");
        }
    }
}

This is very similar to my question actually. This might interest some readers, though it does not mention how to prevent it or existing alternatives.


Solution

  • My concern, and you may see it coming, is about the raised exceptions: would that imply exception-based control flow in the upper application layer? If so, how to prevent it?

    One answer is to apply the Template Method. For example, we might choose to implement a "Create User Service" like so:

    class UserService:
      def __init__(self, user_repo: UserRepository, group_repo: GroupRepository):
        self.user_repo = user_repo
        self.group_repo = group_repo
    
      def create(self, user: User) -> User:
        if self.user_repo.exists(user.name):
          return self.user_already_exists(user)
          
        for group in user.groups:
          if not self.group_repo.exists(group.name):
            return self.group_not_found(group)
        
        self.user_repo.add(user)
        return self.user_added(user)
    

    The point being that we now have three different "seams" (see Michael Feathers, Working Effectively with Legacy Code) that we can use to customize the behaviors that we want.

    If we were going to emulate your original design, then we might have...

      def user_already_exists(self user: User):
        raise UserNotFoundException
    
      def group_not_found(self, group: Group):
        raise GroupNotFoundException(f"'{group.name}' has not been found")
    
      def user_added(self, user: User):
        return user
    

    But we might imagine replacing these implementations like so:

      def user_already_exists(self user: User):
        return HTTPException(409, "Given user already exists")
    
      def group_not_found(self, group: Group):
        raise HTTPException(422, str(GroupNotFoundException(f"'{group.name}' has not been found")))
    
      def user_added(self, user: User):
        return user
    

    (These alternative implementations might be provided using inheritance, or with some minor changes to the design you could use composition instead. Or, if passing callbacks around is more familiar, then you could do that. You get to pick your trade offs.)

    Another alternative, which would be more familiar in a "functional programming" context, would be to use "OR types". Scott Wlaschin's work on Domain Modeling Made Functional would be a reasonable idea. But the basic sketch would be that you would explicitly return a value that represents one of a number of alternatives, which we might obtain by invoking the appropriate class method / "named constructor":

      def user_already_exists(self user: User):
        return MagicOrType.user_already_exists(user)
    
      def group_not_found(self, group: Group):
        raise MagicOrType.group_not_found(group)
    
      def user_added(self, user: User):
        return MagicOrType.user_added(user)
    

    And then the code receiving the or-value would be responsible for choosing the appropriate branches to take.

    Again, "just" a different set of trade offs.


    Note that we might not make the same trade offs everywhere: for example, db_conn::execute might fail for reasons completely unrelated to the "domain logic". Should that be another branch that we explicitly handle, or should the code an exception that propagates all the way up the call stack? Does that depend on the context that the repository is running in? Do we want transient errors handled differently than permanent ones? etc.