Search code examples
pythonmany-to-manysqlalchemy

Python's SQLAlchemy doesn't clean out the secondary (many-to-many) table?


I have a many-to-many relationship between Users and Tasks. I want the "secondary table" (meaning, the table that facilitates the many-to-many relation) to be cleaned out when I delete a Task or User. How can I configure SQLAlchemy for this?

Here is some sample python code which demonstrates the problem I'm having. Note: This code is fully self contained, and only requires the sqlalchemy module. If you copy and paste this code, you should be able to run it without any side effects and see the same behavior yourself. The last line of the script shows that the relevant row in the "secondary table" was not removed when I removed the corresponding task. All the assertions pass in this example.

from sqlalchemy import create_engine, Column, Integer, Text, Table, ForeignKey
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Session, relationship

Model = declarative_base()

class User(Model):
    __tablename__ = 'users'
    id = Column('user_id', Integer, primary_key=True)
    email = Column('email', Text, unique=True)

    def __init__(self, email):
        self.email = email

user_tasks = Table('user_tasks', Model.metadata,
    Column('user_id', Integer, ForeignKey('users.user_id')),
    Column('task_id', Integer, ForeignKey('tasks.task_id')))

class Task(Model):
    __tablename__ = 'tasks'
    id = Column('task_id', Integer, primary_key=True)
    description = Column('description', Text)
    assigned_to = relationship('User', secondary=user_tasks, backref='tasks')

    def __init__(self, description):
        self.description = description

if __name__ == '__main__':
    engine = create_engine('sqlite:///:memory:')
    Model.metadata.create_all(engine)
    s = Session(engine)
    the_user = User('user')
    s.add(the_user)
    s.commit()
    assert s.query(User).all() == [the_user]
    user_task = Task('user_one task')
    user_task.assigned_to.append(the_user)
    s.add(user_task)
    s.commit()
    assert s.query(Task).all() == [user_task]
    assert s.query(user_tasks).all() == [(1,1)]
    s.query(Task).delete()
    s.commit()
    assert s.query(Task).all() == []
    assert s.query(User).all() == [the_user]
    assert s.query(user_tasks).all() == [(1,1)]  # I was expecting [] .

Solution

  • See delete(synchronize_session='evaluate'):

    The method does not offer in-Python cascading of relationships - it is assumed that ON DELETE CASCADE is configured for any foreign key references which require it. The Session needs to be expired (occurs automatically after commit(), or call expire_all()) in order for the state of dependent objects subject to delete or delete-orphan cascade to be correctly represented.

    That is, SQLAlchemy isn't able to find all the Task objects you're deleting and figure out each row to be deleted from user_tasks - the best way to do this is to use ON DELETE CASCADE on the foreign keys (won't work with MySQL MyISAM tables or SQLite if foreign keys aren't enabled):

    http://docs.sqlalchemy.org/en/latest/core/constraints.html#on-update-and-on-delete