Search code examples
python-3.xsqlalchemyevent-listenerrelationships

symmetry in backref relationships not working as expected with event listeners


I used to think that, in many to many relationships (perhaps for all of the relationships available)

parent.children.append(child)

and

child.parents.append(parent)

Were equivalent. I have, however, a rather complex set up which fires up event listeners for the append and remove methods of my classes.

This might be a long shot, but I have written a self contained example that explains my problem. In short, it contains:

  • Three classes: Parent, Child and Pet. where
    • Parent has a many-to-many relationship with both Child and Pet
    • Child has a one-to-many relationship with Pet
  • Three event.listens_for events
    • @sa.event.listens_for(Parent.children, 'append')
    • @sa.event.listens_for(Parent.children, 'remove')
    • @sa.event.listens_for(Parent.pets, 'remove')
  • A test setup using unittest, with two tests:
    • test_child_pet_relationship_on_parents_combined test the appending of stuff from Child, like child1.parents.append(test_parent) where child1 is an instance of Child and test_parent is an instance of Parent
    • test_child_pet_relationship_on_parents_combined_reversed does the same, but the other way around, like test_parent.children.append(child1).
  • A logging setup to log whatever is happening on each query. I have set some sloppy logger entries on each test and also in the first event listener. Namely, logs with A refer to the first test and A' refer to the second test.

asas

import logging
import sys

import sqlalchemy as sa
from sqlalchemy.ext.declarative import declarative_base 

# setup logger
stdout_handler = logging.StreamHandler(sys.stdout)
formatter = logging.Formatter('%(asctime)s:%(filename)s:%(lineno)d\t%(levelname)s\t%(message)s')
stdout_handler.setFormatter(formatter)

logger = logging.getLogger('sqlalchemy.engine')
logger.setLevel(logging.DEBUG)
logger.addHandler(stdout_handler)

Base = declarative_base()

# many to many relationship between parents and children
parents_children_relationship = sa.Table('parents_children_relationship',
    Base.metadata,
    sa.Column('parent_id', sa.Integer, sa.ForeignKey('parents.id')),
    sa.Column('child_id', sa.Integer, sa.ForeignKey('children.id')),
    sa.UniqueConstraint('parent_id', 'child_id'))

# many to many relationship between User and Pet
parents_pets_relationship = sa.Table('parents_pets_relationship',
    Base.metadata,
    sa.Column('parent_id', sa.Integer, sa.ForeignKey('parents.id')),
    sa.Column('pet_id', sa.Integer, sa.ForeignKey('pets.id')),
    sa.UniqueConstraint('parent_id', 'pet_id'))

class Parent(Base):
    __tablename__ = 'parents'

    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String(64))

    # many to many relationship between parent and children
    # my case allows for a children to have many parents. Don't ask.
    children = sa.orm.relationship('Child',
                            secondary=parents_children_relationship,
                            backref=sa.orm.backref('parents', lazy='dynamic'),
                            lazy='dynamic')

    # many to many relationship between parents and pets
    pets = sa.orm.relationship('Pet',
                            secondary=parents_pets_relationship,
                            backref=sa.orm.backref('parents', lazy='dynamic'), #
                            lazy='dynamic')


    def __repr__(self):
        return '<Parent (name=%r)>' % (self.name)

class Child(Base):
    __tablename__ = 'children'
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String(64))
    # parents = <backref relationship with User model>

    # one to many relationship with pets
    pets = sa.orm.relationship('Pet', backref='child', lazy='dynamic')

    def __repr__(self):
        return '<Child (name=%r)>' % (self.name)

class Pet(Base):
    __tablename__ = 'pets'
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String(64))
    # child = backref relationship with cities
    child_id = sa.Column(sa.Integer, sa.ForeignKey('children.id'), nullable=
True)
    # parents = <relationship backref from User>

    def __repr__(self):
        return '<Pet (name=%r)>' % (self.name)


from sqlalchemy.orm import object_session

@sa.event.listens_for(Parent.children, 'append')
def _on_append_children(parent, child, initiator):
    """
    If a new child is appended to the parent, this listener
    will also add the pets bound to the child being bound to the parent.
    """
    # appends also the pets bound to the child that the 
    # parent is being appended to

    logger.debug(f'**********1. adding the pets of {child} to {parent}***************')

    object_session(parent).execute(
        "INSERT INTO parents_pets_relationship VALUES "
        "(:parent_id, :pet_id)",
        [
            {"parent_id": parent.id, "pet_id": pet.id}
            for pet in child.pets
        ]
    )

    logger.debug('**********1. done!***************')


@sa.event.listens_for(Parent.children, 'remove')
def _on_remove_children(parent, child, initiator, *args, **kwargs):
    """
    If a child is removed from the parent, this listener
    will also remove only remove_single_pet --> <Pet>
    """

    object_session(parent).execute(
        "DELETE FROM parents_pets_relationship WHERE "
        "parent_id=:parent_id AND pet_id=:pet_id",
        [
            {"parent_id": parent.id, "pet_id": pet.id}
            for pet in child.pets
        ]
    )


@sa.event.listens_for(Parent.pets, 'remove')
def _on_remove_pets(parent, pet, initiator, *args, **kwargs):
    """
    If a pet is removed from the parent, and the parent also is related
    to the child that has access to that pet, then

    * removes relationship with the child, and
    * keeps relationship with the remaining pets, except the one that was 
removed
    """

    object_session(parent).execute(
        "DELETE FROM parents_children_relationship WHERE "
        "parent_id=:parent_id AND child_id=:child_id",
        {"parent_id": parent.id, "child_id": pet.child.id}
    ) 

#### test ###

import unittest
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

class BasicTestModelCase(unittest.TestCase):

    def setUp(self):
        self.engine = create_engine("sqlite://", echo=False)
        Base.metadata.create_all(self.engine)

        Session = sessionmaker(bind=self.engine)
        self.session = Session()


    def tearDown(self):
        Base.metadata.drop_all(bind=self.engine)

    def test_child_pet_relationship_on_parents_combined(self):
        """
        Test that a parent can be hold children and pets that don't
        belong necessary to the child, given the behaviour tested in the 
        previous test.
        """

        # create new parent
        test_parent = Parent(name='test_parent')

        child1 = Child(id=1,
                        name='FakeChild1')

        child2 = Child(id=2,
                        name='FakeChild2')

        pet1 = Pet(id=1,
                    name='FakePet1',
                    child_id=1)

        pet2 = Pet(id=2,
                    name='FakePet2',
                    child_id=2)

        pet3 = Pet(id=3,
                    name='FakePet3',
                    child_id=1)

        self.session.add(test_parent)
        self.session.add(child1)
        self.session.add(child2)
        self.session.add(pet1)
        self.session.add(pet2)
        self.session.add(pet3)
        self.session.commit()

        # add parent to the child
        logger.debug('************A - add test_parent to child1***************')
        child1.parents.append(test_parent)
        self.session.add(child1)
        self.session.commit()
        logger.debug('**********A - done!***************')

        # add parent to the child
        pet2.parents.append(test_parent)
        logger.debug('************B - add test_parent to child1***************')

        # persist changes in the db
        self.session.add(pet2)
        self.session.commit()
        logger.debug('**********B - done!***************')

        print(test_parent.pets.all())
        print(child2.pets.all())

        # check that previous relationships are intact
        self.assertTrue(child1.pets.all() == [pet1, pet3])
        self.assertTrue(child2.pets.all() == [pet2])

        # resultant elements should be only child1, its pets and the single Pet
        self.assertTrue(test_parent.children.all() == [child1])
        self.assertTrue(test_parent.pets.all() == [pet1, pet2, pet3])

        # remove child from parent
        logger.debug('***********C - remove test_parent from pet3****************')
        pet3.parents.remove(test_parent) ## ERROR here
        logger.debug('**********C - done!***************')

        # resultant elements should be remaining pets, and no child
        self.assertTrue(test_parent.children.all() == [])
        self.assertTrue(test_parent.pets.all() == [pet1, pet2]) # pet2 was not touched, 
                                                                # but pet1 should remain 
                                                                # since only                                                                 
                                                                # pet3 was removed                                                                
                                                                # child1 should be also removed since                                                                 
                                                                # relationship is unbalanced, i.e.                                                                
                                                                # user can't have access to a child if it                                                                
                                                                # does not have access to all of the child's pets

    def test_child_pet_relationship_on_parents_combined_reversed(self):
        """
        Test that a parent can hold children and pets that don't
        belong necessary to the child.
        """

        # create new parent
        test_parent = Parent(name='test_parent')

        child1 = Child(id=1,
                        name='FakeChild1')

        child2 = Child(id=2,
                        name='FakeChild2')

        pet1 = Pet(id=1,
                    name='FakePet1',
                    child_id=1)

        pet2 = Pet(id=2,
                    name='FakePet2',
                    child_id=2)

        pet3 = Pet(id=3,
                    name='FakePet3',
                    child_id=1)

        self.session.add(test_parent)
        self.session.add(child1)
        self.session.add(child2)
        self.session.add(pet1)
        self.session.add(pet2)
        self.session.add(pet3)
        self.session.commit()


        logger.debug('************A` - add child1 to test_parent***************')
        # add parent to the child
        test_parent.children.append(child1)
        self.session.add(test_parent)
        self.session.commit()
        logger.debug('**********A` - done!***************')


        logger.debug('************B` - add pet2 to test_parent***************')
        # add parent to the child
        test_parent.pets.append(pet2)

        # persist changes in the db
        self.session.add(test_parent)
        self.session.commit()
        logger.debug('**********B` - done!***************')

        # check that previous relationships are intact
        self.assertTrue(child1.pets.all() == [pet1, pet3])
        self.assertTrue(child2.pets.all() == [pet2])

        # resultant elements should be only child1, its pets and the single Pet
        self.assertTrue(test_parent.children.all() == [child1])
        self.assertTrue(test_parent.pets.all() == [pet1, pet2, pet3])

        # remove child from parent
        logger.debug('***********C` - remove pet3 from test_parent****************')
        test_parent.pets.remove(pet3) 
        logger.debug('**********C` - done!***************')

        # resultant elements should be remaining pets, and no child
        self.assertTrue(test_parent.children.all() == [])
        self.assertTrue(test_parent.pets.all() == [pet1, pet2]) # pet2 was not touched, 
                                                                # but pet1 should remain 
                                                                # since only                                                                 
                                                                # pet3 was removed                                                                
                                                                # child1 should be also removed since                                                                 
                                                                # relationship is unbalanced, i.e.                                                                
                                                                # user can't have access to a child if it                                                                
                                                                # does not have access to all of the child's pets



import sys

if __name__ == '__main__':
    # # run tests
    unittest.main()

The first test passes, but the second does not. For the first test,

2018-05-21 11:52:50,646:pets2.py:195    DEBUG   ************A - add test_parent to child1***************
2018-05-21 11:52:50,648:base.py:682 INFO    BEGIN (implicit)
2018-05-21 11:52:50,649:base.py:1151    INFO    SELECT children.id AS children_id, children.name AS children_name 
FROM children 
WHERE children.id = ?
2018-05-21 11:52:50,649:base.py:1154    INFO    (1,)
2018-05-21 11:52:50,650:result.py:681   DEBUG   Col ('children_id', 'children_name')
2018-05-21 11:52:50,650:result.py:1106  DEBUG   Row (1, 'FakeChild1')
2018-05-21 11:52:50,652:base.py:1151    INFO    SELECT parents.id AS parents_id, parents.name AS parents_name 
FROM parents 
WHERE parents.id = ?
2018-05-21 11:52:50,652:base.py:1154    INFO    (1,)
2018-05-21 11:52:50,652:result.py:681   DEBUG   Col ('parents_id', 'parents_name')
2018-05-21 11:52:50,652:result.py:1106  DEBUG   Row (1, 'test_parent')
2018-05-21 11:52:50,652:pets2.py:91 DEBUG   **********1. adding the pets of <Child (name='FakeChild1')> to <Parent (name='test_parent')>***************
2018-05-21 11:52:50,654:base.py:1151    INFO    INSERT INTO parents_children_relationship (parent_id, child_id) VALUES (?, ?)
2018-05-21 11:52:50,654:base.py:1154    INFO    (1, 1)
2018-05-21 11:52:50,656:base.py:1151    INFO    SELECT pets.id AS pets_id, pets.name AS pets_name, pets.child_id AS pets_child_id 
FROM pets 
WHERE ? = pets.child_id
2018-05-21 11:52:50,656:base.py:1154    INFO    (1,)
2018-05-21 11:52:50,657:result.py:681   DEBUG   Col ('pets_id', 'pets_name', 'pets_child_id')
2018-05-21 11:52:50,657:result.py:1106  DEBUG   Row (1, 'FakePet1', 1)
2018-05-21 11:52:50,657:result.py:1106  DEBUG   Row (3, 'FakePet3', 1)
2018-05-21 11:52:50,658:base.py:1151    INFO    INSERT INTO parents_pets_relationship VALUES (?, ?)
2018-05-21 11:52:50,658:base.py:1154    INFO    ((1, 1), (1, 3))
2018-05-21 11:52:50,658:pets2.py:102    DEBUG   **********1. done!***************
2018-05-21 11:52:50,658:base.py:722 INFO    COMMIT
2018-05-21 11:52:50,659:pets2.py:199    DEBUG   **********A - done!***************

But for the second test, The logs and the error traceback is verbose, but the failing part is

2018-05-21 11:52:50,723:pets2.py:274    DEBUG   ************A` - add child1 to test_parent***************
2018-05-21 11:52:50,724:base.py:682 INFO    BEGIN (implicit)
2018-05-21 11:52:50,724:base.py:1151    INFO    SELECT children.id AS children_id, children.name AS children_name 
FROM children 
WHERE children.id = ?
2018-05-21 11:52:50,724:base.py:1154    INFO    (1,)
2018-05-21 11:52:50,725:result.py:681   DEBUG   Col ('children_id', 'children_name')
2018-05-21 11:52:50,725:result.py:1106  DEBUG   Row (1, 'FakeChild1')
2018-05-21 11:52:50,726:base.py:1151    INFO    SELECT parents.id AS parents_id, parents.name AS parents_name 
FROM parents 
WHERE parents.id = ?
2018-05-21 11:52:50,726:base.py:1154    INFO    (1,)
2018-05-21 11:52:50,726:result.py:681   DEBUG   Col ('parents_id', 'parents_name')
2018-05-21 11:52:50,727:result.py:1106  DEBUG   Row (1, 'test_parent')
2018-05-21 11:52:50,727:pets2.py:91 DEBUG   **********1. adding the pets of <Child (name='FakeChild1')> to <Parent (name='test_parent')>***************
2018-05-21 11:52:50,729:base.py:1151    INFO    INSERT INTO parents_children_relationship (parent_id, child_id) VALUES (?, ?)
2018-05-21 11:52:50,729:base.py:1154    INFO    (1, 1)
2018-05-21 11:52:50,731:base.py:1151    INFO    SELECT pets.id AS pets_id, pets.name AS pets_name, pets.child_id AS pets_child_id 
FROM pets 
WHERE ? = pets.child_id
2018-05-21 11:52:50,731:base.py:1154    INFO    (1,)
2018-05-21 11:52:50,731:result.py:681   DEBUG   Col ('pets_id', 'pets_name', 'pets_child_id')
2018-05-21 11:52:50,732:result.py:1106  DEBUG   Row (1, 'FakePet1', 1)
2018-05-21 11:52:50,732:result.py:1106  DEBUG   Row (3, 'FakePet3', 1)
2018-05-21 11:52:50,732:base.py:1151    INFO    INSERT INTO parents_pets_relationship VALUES (?, ?)
2018-05-21 11:52:50,732:base.py:1154    INFO    ((1, 1), (1, 3))
2018-05-21 11:52:50,733:pets2.py:102    DEBUG   **********1. done!***************
2018-05-21 11:52:50,735:base.py:1151    INFO    INSERT INTO parents_children_relationship (parent_id, child_id) VALUES (?, ?)
2018-05-21 11:52:50,735:base.py:1154    INFO    (1, 1)
2018-05-21 11:52:50,735:base.py:702 INFO    ROLLBACK

Based on the logger calls, "something" is happening after the event listener for Parent.children.append is finished and before pet2.parents_pets.append(test_parent) is called. This results in a row in the association table twice, why is this happening?

The specific error traceback from the test is

sqlalchemy.exc.IntegrityError: (sqlite3.IntegrityError) UNIQUE constraint failed: parents_children_relationship.parent_id, parents_children_relationship.child_id [SQL: 'INSERT INTO parents_children_relationship (parent_id, child_id) VALUES (?, ?)'] [parameters: (1, 1)]

This question is related to

override relationship behaviour in sqlalchemy

And also to the same question in the Sqlalchemy's mailing list

https://groups.google.com/forum/#!topic/sqlalchemy/jgKgv5zQT7E

I know is a lot to process, but I wonder why is this relationship not working symmetrically i.e why is one test passing but the other is not?

Thanks!

EDIT:

If I replace the event listener for one that listens on the counterpart append method (that is, the method for the Class that is the other side in the many to many relationship), for something like

@sa.event.listens_for(Child.parents_children, 'append')
def _on_append_children(child, parent, initiator):
    """
    If a new child is appended to the parent, this listener
    will also add the pets bound to the child being bound to the parent.
    """
    # appends also the pets bound to the child that the 
    # parent is being appended to

    logger.debug(f'**********1. (reversed) Adding the pets of {child} to {parent}***************')

    object_session(child).execute(
        "INSERT INTO parents_pets_relationship VALUES "
        "(:parent_id, :pet_id)",
        [
            {"parent_id": parent.id, "pet_id": pet.id}
            for pet in child.pets
        ]
    )

    logger.debug('**********1. done!***************')

Then the first test fails and the second passes, which kinda isolates the issue at the listeners, and probably related to the object_session object.


Solution

  • From the SQLAlchemy's mailing list (I edited it for context, credits to @zzzeek for his patience and help!)

    the issue is because the addition of test_parent to child1.parents triggers two individual "dirty" events, which each resolve to the same INSERT operation into the parents_children_relationship table. Normally, these two dirty events are resolved together during the flush process. However, within your "insert" event handler, calling upon the "child.pets" collection, since it is a dynamic relationship, triggers autoflush. So, the first dirty event for parent being appended to child is handled, the row is inserted within autoflush. then your event completes. then the backref handler goes off and appends child to parent, triggering the second dirty event. the session.commit() then tries to flush the same thing a second time which fails.

    the solution is to not flush within the event handlers.

    sess = object_session(parent)
    with sess.no_autoflush:
        sess.execute(
            "INSERT INTO parents_pets_relationship VALUES "
            "(:parent_id, :pet_id)",
            [
                {"parent_id": parent.id, "pet_id": pet.id}
                for pet in child.pets
            ]
        )
    

    Changing this part on the listeners make both tests pass