Search code examples
pythonsqlalchemymany-to-many

SQLAlchemy order_by many to many relationship through association proxy


I have a many to many relationship setup in a Flask app in SQLAlchemy using a Association Object. I then have have assocation proxies setup between the the classes, to give more direct access rather than going via the association object.

Here is an abbreviated example of the setup:

class Person(Model):
    __tablename__ = 'persons'
    id = Column(Integer, primary_key=True)
    last_name = Column(Text, nullable=False)
    groups = association_proxy('group_memberships', 'group')
    # Other stuff

class Group(Model):
    __tablename__ = 'groups'
    id = Column(Integer, primary_key=True)
    name = Column(Text, nullable=False)
    members = association_proxy('group_memberships', 'person')
    # Other stuff

class GroupMembership(Model):
    __tablename__ = 'group_memberships'
    id = Column(Integer, primary_key=True)
    person_id = Column(Integer, ForeignKey('persons.id'), nullable=False)
    group_id = Column(Integer, ForeignKey('groups.id'), nullable=False)
    person = relationship('Person', uselist=False, backref=backref('group_memberships', cascade='all, delete-orphan'))
    group = relationship('Group', uselist=False, backref=backref('group_memberships', cascade='all, delete-orphan'))    
    # Other stuff

What I cannot for the life of me figure out is how to get the members returned by group.members to be sorted by their last_name?


Solution

  • In order to sort group.members you have to have the Persons available for sorting while loading the GroupMembership association objects. This can be achieved with a join.

    In your current configuration accessing group.members first loads the GroupMembership objects, filling group.group_memberships relationship, and then fires a SELECT for each Person as the association proxy accesses the GroupMembership.person relationship attributes.

    Instead you want to load both the GroupMemberships and Persons in the same query, sorted by Person.last_name:

    class GroupMembership(Model):
        __tablename__ = 'group_memberships'
        id = Column(Integer, primary_key=True)
        person_id = Column(Integer, ForeignKey('persons.id'), nullable=False)
        group_id = Column(Integer, ForeignKey('groups.id'), nullable=False)
        person = relationship('Person',
                              backref=backref('group_memberships',
                                              cascade='all, delete-orphan'),
                              lazy='joined', innerjoin=True,
                              order_by='Person.last_name')
        group = relationship('Group', backref=backref('group_memberships',
                                                      cascade='all, delete-orphan'))
        # Other stuff
    

    You need to define the order_by='Person.last_name' on the scalar relationship attribute GroupMembership.person instead of the backref Group.group_memberships, which could seem like the logical thing to do. On the other hand order_by "indicates the ordering that should be applied when loading these items", so it makes sense when using joined loading. Since you'll be joining a many-to-one reference and the foreign key is not nullable, you can use an inner join.

    With the given definition in place:

    In [5]: g = Group(name='The Group')
    
    In [6]: session.add_all([GroupMembership(person=Person(last_name=str(i)), group=g)
       ...:                  for i in range(30, 20, -1)])
    
    In [7]: session.commit()
    
    In [8]: g.members
    2017-06-29 09:17:37,652 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)
    2017-06-29 09:17:37,653 INFO sqlalchemy.engine.base.Engine SELECT groups.id AS groups_id, groups.name AS groups_name 
    FROM groups 
    WHERE groups.id = ?
    2017-06-29 09:17:37,653 INFO sqlalchemy.engine.base.Engine (1,)
    2017-06-29 09:17:37,655 INFO sqlalchemy.engine.base.Engine SELECT group_memberships.id AS group_memberships_id, group_memberships.person_id AS group_memberships_person_id, group_memberships.group_id AS group_memberships_group_id, persons_1.id AS persons_1_id, persons_1.last_name AS persons_1_last_name 
    FROM group_memberships JOIN persons AS persons_1 ON persons_1.id = group_memberships.person_id 
    WHERE ? = group_memberships.group_id ORDER BY persons_1.last_name
    2017-06-29 09:17:37,655 INFO sqlalchemy.engine.base.Engine (1,)
    Out[8]: [<__main__.Person object at 0x7f8f014bdac8>, <__main__.Person object at 0x7f8f014bdba8>, <__main__.Person object at 0x7f8f014bdc88>, <__main__.Person object at 0x7f8f01ddc390>, <__main__.Person object at 0x7f8f01ddc048>, <__main__.Person object at 0x7f8f014bdd30>, <__main__.Person object at 0x7f8f014bde10>, <__main__.Person object at 0x7f8f014bdef0>, <__main__.Person object at 0x7f8f014bdfd0>, <__main__.Person object at 0x7f8f0143b0f0>]
    
    In [9]: [p.last_name for p in _]
    Out[9]: ['21', '22', '23', '24', '25', '26', '27', '28', '29', '30']
    

    A downside of this solution is that the person relationship is always loaded eagerly and the ORDER BY applied when querying for GroupMemberships:

    In [11]: session.query(GroupMembership).all()
    2017-06-29 12:33:28,578 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)
    2017-06-29 12:33:28,578 INFO sqlalchemy.engine.base.Engine SELECT group_memberships.id AS group_memberships_id, group_memberships.person_id AS group_memberships_person_id, group_memberships.group_id AS group_memberships_group_id, persons_1.id AS persons_1_id, persons_1.last_name AS persons_1_last_name 
    FROM group_memberships JOIN persons AS persons_1 ON persons_1.id = group_memberships.person_id ORDER BY persons_1.last_name
    2017-06-29 12:33:28,578 INFO sqlalchemy.engine.base.Engine ()
    Out[11]: 
        ...
    

    ...unless another loading strategy is used explicitly:

    In [16]: session.query(GroupMembership).options(lazyload('person')).all()
    2018-04-05 21:10:52,404 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)
    2018-04-05 21:10:52,405 INFO sqlalchemy.engine.base.Engine SELECT group_memberships.id AS group_memberships_id, group_memberships.person_id AS group_memberships_person_id, group_memberships.group_id AS group_memberships_group_id 
    FROM group_memberships
    2018-04-05 21:10:52,405 INFO sqlalchemy.engine.base.Engine ()
    

    If you need alternate orderings from time to time, you'll have to revert to issuing a full query that uses explicit eager loading and order by:

    In [42]: g = session.query(Group).\
        ...:     filter_by(id=1).\
        ...:     join(GroupMembership).\
        ...:     join(Person).\
        ...:     options(contains_eager('group_memberships')
        ...:             .contains_eager('person')).\
        ...:     order_by(Person.last_name.desc()).\
        ...:     one()
        ...:             
    
    In [43]: [m.last_name for m in g.members]
    Out[43]: ['30', '29', '28', '27', '26', '25', '24', '23', '22', '21']