Search code examples
sqlalchemycascadingself-referencing-table

SQLAlchemy: How to query an optional column on a self referential adjacency list table?


I have the following Table model representing a timeline.

class TimeRange(Base):


    __tablename__ = "time_line"


    record_id = Column(Integer, primary_key=True)
    level = Column(String, nullable=False) # e.g. "Point", "Range"
    content = Column(String, nullable=False)
    language_marker = Column(String) # this one column is optional and needs to be queried
    immediate_parent_id = Column(Integer, ForeignKey('time_line.record_id'))
    child_timelines = relationship('TimeRange', backref=backref('parent_timeline', remote_side=[record_id]))

The language_marker Column is the one that needs to be queried in a recursive manner. Not all records have such an attribute, and the business logic is: along the hierarchy lineage from the root down to the child timelines, at least one level of the TimeRange instance carries such an attribute, and the one in the lowest level should be returned. This works a little like cascading style sheet, where if the TimeRange object itself doesn't have such an attribute, just look further up one level above, util found one, and the latest defined style wins.

What is the technical direction I should look into to implement such queries? I'm using SQLAlchemy and the backend is SQLite. Thanks.


Solution

  • I would add a property derived_language_marker that starts with the current object and goes up the hierarchy of parent TimeRanges until it finds a language_marker that is not None and returns it:

     @property
     def derived_language_marker(self):
         time_range = self
         while not time_range.language_marker and time_range.parent_timeline:
             time_range = time_range.parent_timeline
         return time_range.language_marker
    

    It can be accessed just as time_range.derived_language_marker, but the drawback is that it could result in a new SELECT query for each level it checks if the parent objects are not already loaded.

    (If you are doing this for many TimeRanges and performance becomes a problem you can eager-load ancestors up to a certain depth with joins as described here as the long as the maximum possible depth is not too extreme. For further optimization, there is also this approach for SQLite.)