Search code examples
pythonsqlalchemy

SQLAlchemy: Parent instance not bound to a session, even though it should be?


This is yet another "Parent instance is not bound to a Session" question.

I have function that does the following (simplified):

def check_schedules(org):
   # ...
   for user in org.users:
        for schedule in user.schedules:
            schedule._set_minimum_time()

Where org is an ORM model, users is a relationship to User model, and schedules is a relationship to Schedule model

And:

class Schedule(Base):
    # ... 
    def _set_minimum_time(self):
        organization_schedule = self.user.organization.schedule

check_schedules is called in various flows, and succeeds. However in a specific flow, from within a worker's job, it raises the DetachedInstanceError error:

DetachedInstanceError: Parent instance <Schedule at 0x7f2900ab3af0> is not bound to a Session; lazy load operation of attribute 'user' cannot proceed

Do you have any explanation as to why this happens?
The session is a scoped session (created with autocommit=False and autoflush=False), there are no other threads running, and we can see that the loading of user was successfully lazy-loaded in the first loop so we'd expect it to already be in the session when schedule tries to dereference it in the _set_minimum_time function.

Python: 3.9.17
SQLAlchemy version: 1.3.24

UPDATE #1:

Upon debugging, and breaking on schedule._set_minimum_time() I can see that schedule not in db_session

In fact, all([schedule not in db_session for schedule in user.schedules]) returns True

Still not sure why this happens, but the relationship of schedules and user is defined as follows:

class User(Base):
  # ..
  schedules = relationship(
        "Schedule",
        cascade="all, delete-orphan",
        passive_deletes=False,
        back_populates="user",
    )

and

class Schedule(Base):
  # ..
  user_uuid = Column(UUID, ForeignKey("user.uuid"), nullable=False, index=True)
  user = relationship("User", back_populates="schedules")

Solution

  • After investing some time debugging this I was able to pinpoint the issue!

    To simplify, the flow that failed did the following:

    def flow_that_fails():
        do_something(org)
        check_schedules(org)
    
    def do_something(org):
        # This was some function called prior to check_schedules
        # and internally it also did some calculation with the schedules
        # Specifically it expunged the schedules before doing the calculation
        for user in org.users:
            for schedule in user.schedules:
                # ...
                db_session.expunge(schedule)
                # ... 
    

    The fact that the schedules were previously expunged made it so that the following time the relationship was iterated in check_schedules it returned the already expunged objects.

    So the small reproducer for this issue is essentially:

    schedules_uuids = []
    for schedule in user.schedules:
        schedules.append(schedule.uuid)
        db_session.expunge(schedule)
    
    for schedule in user.schedules:
        assert schedule not in db_session
    
    for schedule_uuid in user.schedules:
        schedule = db_session.query(Schedule).get(schedule_uuid)
        assert schedule in db_session
    
    for schedule in user.schedules:
        # Even though the objects were re-fetched above using the uuids
        # Going through the relationship still returns the previously expunged
        # objects
        assert schedule not in db_session
    

    I'm not a 100% this is the expected behavior, such that the relationship memoizes the expunged objects.

    In any case, this wasn't easy to debug 😅 (The way I found the problem eventually was by reproducing the exception locally and then bisecting the code, removing calls to functions that were suspected to damage the session, until I was able to pinpoint the offending functions and dive into them until I found that offending expunge)