Search code examples
pythonsqlalchemy

Populate relationship in SQLAlchemy with `query().join()`


Consider the following models

class A(Base):
    __tablename__ = "as"

    id = mapped_column(Integer, primary_key=True)
    b_id = mapped_column(ForeignKey("bs.id"))
    b: Mapped[B] = relationship()


class B(Base):
    __tablename__ = "bs"

    id = mapped_column(Integer, primary_key=True)
    c_id = mapped_column(ForeignKey("cs.id"))
    c: Mapped[C] = relationship()
    x = mapped_column(Integer)


class C(Base):
    __tablename__ = "cs"

    id = mapped_column(Integer, primary_key=True)
    y = mapped_column(Integer)

I want to query A objects with constraints on the associated .b.x and .b.c.y value, and I want the resulting A objects to have populated fields (not lazy).

  1. If I use joinedload(A.b).joinedload(B.c), I cannot apply constraints directly:
select(A)
  .options(joinedload(A.b).joinedload(B.c))
  .where(B.x == 0, C.y == 0)
SELECT `as`.id, `as`.b_id, cs_1.id AS id_1, cs_1.y, bs_1.id AS id_2, bs_1.c_id, bs_1.x
FROM `as` LEFT OUTER JOIN bs AS bs_1 ON bs_1.id = `as`.b_id LEFT OUTER JOIN cs AS cs_1 ON cs_1.id = bs_1.c_id, bs, cs
WHERE bs.x = ? AND cs.y = ?

and as you can see, the constraints are not on the joined table, so the query is incorrect.

  1. If I use .has(), I get the proper results but this creates am inefficient query (in practice I have many more constraints and large tables):
select(A)
  .options(joinedload(A.b).joinedload(B.c))
  .where(
    A.b.has(
      and_(B.x == 0, b.c.has(C.y == 0))
    )
  )
SELECT `as`.id, `as`.b_id, cs_1.id AS id_1, cs_1.y, bs_1.id AS id_2, bs_1.c_id, bs_1.x
FROM `as` LEFT OUTER JOIN bs AS bs_1 ON bs_1.id = `as`.b_id LEFT OUTER JOIN cs AS cs_1 ON cs_1.id = bs_1.c_id
WHERE EXISTS (SELECT 1
FROM bs
WHERE bs.id = `as`.b_id AND bs.x = ? AND (EXISTS (SELECT 1
FROM cs
WHERE cs.id = bs.c_id AND cs.y = ?)))

with (EXISTS (SELECT 1 ...)) which are inefficient and not needed here.

  1. If I use .join(), I can get clean request, but then the relationship fields do not get populated automatically:
select(A)
  .join(B, A.b_id == B.id)
  .join(C, B.c_id == C.id)
  .where(B.x == 0, C.y == 0)
SELECT `as`.id, `as`.b_id
FROM `as` INNER JOIN bs ON `as`.b_id = bs.id INNER JOIN cs ON bs.c_id = cs.id
WHERE bs.x = ? AND cs.y = ?

and as you can see, the fields for A.b and A.b.c are not included, so accessing A.b will trigger a new request (same for A.b.c of course).

  1. If I combine .join and joinedload, the request is not valid:
select(A)
  .join(B, A.b_id == B.id)
  .join(C, B.c_id == C.id)
  .where(B.x == 0, C.y == 0)
  .options(joinedload(A.b).joinedload(B.c))
SELECT `as`.id, `as`.b_id, cs_1.id AS id_1, cs_1.y, bs_1.id AS id_2, bs_1.c_id, bs_1.x 
FROM `as` INNER JOIN bs ON `as`.b_id = bs.id INNER JOIN cs ON bs.c_id = cs.id LEFT OUTER JOIN bs AS bs_1 ON bs_1.id = `as`.b_id LEFT OUTER JOIN cs AS cs_1 ON cs_1.id = bs_1.c_id
WHERE bs.x = ? AND cs.y = ?

contains 4 joins instead of two.


Is there a way to get A.b and A.b.c populated as-if I used joinedload, but using a SQL requests similar to .join()?


Solution

  • How's this:

    from sqlalchemy.orm import contains_eager
    
    query = (
        select(A)
        .join(A.b)
        .join(B.c)
        .where(and_(B.x == 0, C.y == 0))
        .options(contains_eager(A.b), contains_eager(A.b, B.c))
    )
    

    Original answer:

    select(A)
        .join(A.b)
        .join(B.c)
        .where(and_(B.x == 0, C.y == 0))
        .options(joinedload(A.b).joinedload(B.c))