Search code examples
pythonmysqljoinsqlalchemy

sub-user join with sqlalchemy


I am trying to find a good solution for the following problem

  • There are main-users and sub-users which identify themselves via a parent_user_id.
  • All users also get additional information from the company table

I need to join sub-users to the main-users to get the company information, as well as doing the same for main-users

Users (contains sub and main users)

  • user_id
  • login
  • parent_user_id
  • company_id

Company (contains company information for all users of this company)

  • company_id
  • contact_address

when I search for the user with id 7, it should return the login and the contact address of this user. This is relatively easy because I simply join from Users to Company directly. However, this user 7 can also have a sub-user, which has user_id 11 and parent_user_id 7.

I first need to join from the parent_user_id 7 to the user 7, and then join from that user's company_id to the Company table to get the contact address.

The challenge here is that a main-user will have a parent_user_id of "NULL", therefore it cannot join from it's parent_user_id to its own user_id. This only works for sub-users.

What I did is a join( or_( ) ) for both parent's company_id as well as the user's company_id (after joining all the users to their parents).

Is this the most common solution for this problem? it seems a bit lengthy, requires aliases, outer joins and an or_ join to allow the two different types of users

user = Users.alias("user")
parent = Users.alias("parent")
company = Company.alias("company")

joined = user.join(parent, user.c.parent_user_id == parent.c.id, isouter=True)
joined = joined.join(company, or_(user.c.company_id == company.c.company_id, parent.c.company_id == company.c.company_id)

sql = select(user.c.id, user.c.login, company.c.contact_address).select_from(joined)

Maybe there is a "pattern" for this sort of problem? I have looked at reverse cte, but I'm not even sure if it applies to this case. There are only sub-users and no deeper nesting (and no "levels" of depth). Main-users simply have no parent_user_id

Edit: Ideally, I am looking for an SQLAlchemy core answer that doesnt use classes, therefore it only uses the metadata from the database using autoload


Solution

  • As commented, you solution is good enough, I think.

    However, I usually try to "hide" implementation details for such tricky situations by encapsulating them into the responsible data class itself. In this case I would add a Hybrid Attribute to the User class like so:

    class User(...):
       ...
    
        # relationships
        parent_user = relationship("User", remote_side=[user_id], backref="subusers")
        company = relationship(Company, backref="users")
    
        @hybrid_property
        def company_contact_address(self):
            company = self.company or self.parent_user.company
            return company.contact_address
    
        @company_contact_address.expression
        def company_contact_address(cls):
            PU = aliased(User)
            user_id = case((cls.company_id.is_(None), cls.parent_user_id), else_=cls.user_id)  # sqlite, universal?
            return (
                select(Company.contact_address.label("contact_address"))
                .join(PU, and_(Company.company_id == PU.company_id, PU.user_id==user_id))
                .scalar_subquery()
            )
    

    in which case the usage becomes very clear:

    query = select(User.user_id, User.login, User.company_contact_address)
    for _row in session.execute(query):
        print(_row)