Search code examples
pythonsqlalchemymany-to-manyself-reference

Syntax for late-binding many-to-many self-referential relationship


I have found many explanations for how to create a self-referential many-to-many relationship (for user followers or friends) using a separate table or class:

Below are three examples, one from Mike Bayer himself:

But in every example I've found, the syntax for defining the primaryjoin and secondaryjoin in the relationship is an early-binding one:

# this relationship is used for persistence
friends = relationship("User", secondary=friendship, 
                       primaryjoin=id==friendship.c.friend_a_id,
                       secondaryjoin=id==friendship.c.friend_b_id,
)

This works great, except for one circumstance: when one uses a Base class to define the id column for all of your objects as shown in Mixins: Augmenting the base from the docs

My Base class and followers table are defined thusly:

from flask_sqlchalchemy import SQLAlchemy
db = SQLAlchemy()

class Base(db.Model):
    __abstract__ = True
    id = db.Column(db.Integer, primary_key=True)

user_flrs = db.Table(
    'user_flrs',
    db.Column('follower_id', db.Integer, db.ForeignKey('user.id')),
    db.Column('followed_id', db.Integer, db.ForeignKey('user.id')))

But now I have trouble with the followers relationship that has served me loyally for a while before I moved the id's to the mixin:

class User(Base):
    __table_name__ = 'user'
    followed_users = db.relationship(
        'User', secondary=user_flrs, primaryjoin=(user_flrs.c.follower_id==id),
        secondaryjoin=(user_flrs.c.followed_id==id),
        backref=db.backref('followers', lazy='dynamic'), lazy='dynamic')

db.class_mapper(User)  # trigger class mapper configuration

Presumably because the id is not present in the local scope, though it seems to throw a strange error for that:

ArgumentError: Could not locate any simple equality expressions involving locally mapped foreign key columns for primary join condition 'user_flrs.follower_id = :follower_id_1' on relationship User.followed_users. Ensure that referencing columns are associated with a ForeignKey or ForeignKeyConstraint, or are annotated in the join condition with the foreign() annotation. To allow comparison operators other than '==', the relationship can be marked as viewonly=True.

And it throws the same error if I change the parentheses to quotes to take advantage of late-binding. I have no idea how to annotate this thing with foreign() and remote() because I simply don't know what sqlalchemy would like me to describe as foreign and remote on a self-referential relationship that crosses a secondary table! I've tried many combinations of this, but it hasn't worked thus far.

I had a very similar (though not identical) problem with a self-referential relationship that did not span a separate table and the key was simply to convert the remote_side argument to a late-binding one. This makes sense to me, as the id column isn't present during an early-binding process.

If it is not late-binding that I am having trouble with, please advise. In the current scope, though, my understanding is that id is mapped to the Python builtin id() and thus will not work as an early-binding relationship.

Converting id to Base.id in the joins results in the following error:

ArgumentError: Could not locate any simple equality expressions involving locally mapped foreign key columns for primary join condition 'user_flrs.follower_id = "<name unknown>"' on relationship User.followed_users. Ensure that referencing columns are associated with a ForeignKey or ForeignKeyConstraint, or are annotated in the join condition with the foreign() annotation. To allow comparison operators other than '==', the relationship can be marked as viewonly=True.


Solution

  • You can't use id in your join filters, no, because that's the built-in id() function, not the User.id column.

    You have three options:

    1. Define the relationship after creating your User model, assigning it to a new User attribute; you can then reference User.id as it has been pulled in from the base:

      class User(Base):
          # ...
      
      User.followed_users = db.relationship(
          User,
          secondary=user_flrs,
          primaryjoin=user_flrs.c.follower_id == User.id,
          secondaryjoin=user_flrs.c.followed_id == User.id,
          backref=db.backref('followers', lazy='dynamic'),
          lazy='dynamic'
      )
      
    2. Use strings for the join expressions. Any argument to relationship() that is a string is evaluated as a Python expression when configuring the mapper, not just the first argument:

      class User(Base):
          # ...
      
          followed_users = db.relationship(
              'User',
              secondary=user_flrs,
              primaryjoin="user_flrs.c.follower_id == User.id",
              secondaryjoin="user_flrs.c.followed_id == User.id",
              backref=db.backref('followers', lazy='dynamic'),
              lazy='dynamic'
          )
      
    3. Define the relationships as callables; these are called at mapper configuration time to produce the final object:

      class User(Base):
          # ...
      
          followed_users = db.relationship(
              'User',
              secondary=user_flrs,
              primaryjoin=lambda: user_flrs.c.follower_id == User.id,
              secondaryjoin=lambda: user_flrs.c.followed_id == User.id,
              backref=db.backref('followers', lazy='dynamic'),
              lazy='dynamic'
          )
      

    For the latter two options, see the sqlalchemy.orgm.relationship() documentation:

    Some arguments accepted by relationship() optionally accept a callable function, which when called produces the desired value. The callable is invoked by the parent Mapper at “mapper initialization” time, which happens only when mappers are first used, and is assumed to be after all mappings have been constructed. This can be used to resolve order-of-declaration and other dependency issues, such as if Child is declared below Parent in the same file*[.]*

    [...]

    When using the Declarative extension, the Declarative initializer allows string arguments to be passed to relationship(). These string arguments are converted into callables that evaluate the string as Python code, using the Declarative class-registry as a namespace. This allows the lookup of related classes to be automatic via their string name, and removes the need to import related classes at all into the local module space*[.]*

    [...]

    • primaryjoin

      [...]

      primaryjoin may also be passed as a callable function which is evaluated at mapper initialization time, and may be passed as a Python-evaluable string when using Declarative.

    [...]

    • secondaryjoin

      [...]

      secondaryjoin may also be passed as a callable function which is evaluated at mapper initialization time, and may be passed as a Python-evaluable string when using Declarative.

    Both the string and the lambda define the same user_flrs.c.followed_id == User.id / user_flrs.c.follower_id == User.id expressions as used in the first option, but because they are given as a string and callable function, respectively, you postpone evaluation until SQLAlchemy needs to have those declarations finalised.