Tags: python, sqlalchemy, subquery, scalar-subquery

SQLAlchemy creating a scalar subquery column with comparison to a column from an outer subquery table


I'm trying to write a query with a scalar subquery column that references a column of the enclosing query, where that column comes from a subquery used as the outer FROM clause. Below is a simplified example of what I'm attempting; my real usage is more elaborate.

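from sqlalchemy import Column, Date, Integer, String, func
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

# `engine` is assumed to have been created elsewhere with create_engine(...)
Session = sessionmaker(bind=engine)
session = Session()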

Base = declarative_base()

class A(Base):
    __tablename__ = "TestA"
    id = Column('id', Integer, primary_key=True)
    name = Column('name', String).label('name')
    metric = Column('metric', Integer).label('metric')
    increment_day = Column('increment_day', Date).label('increment_day')

class B(Base):
    __tablename__ = "TestB"
    id = Column('id', Integer, primary_key=True)
    name = Column('name', String).label('name')
    metric = Column('metric', Integer).label('metric')
    increment_day = Column('increment_day', Date).label('increment_day')

class C(Base):
    __tablename__ = "TestC"
    c_id = Column('c_id', Integer, primary_key=True)
    c_metric = Column('c_metric', Integer)
    c_increment_day = Column('c_increment_day', Date)

a_query = session.query(*[A.id, A.name, A.metric, A.increment_day,]).filter(A.increment_day=='2012-01-01')
b_query = session.query(*[B.id, B.name, B.metric, B.increment_day,]).filter(B.increment_day=='2012-01-02')
inner_query = a_query.union_all(b_query).subquery('res')
outer_query = session.query(*[inner_query.c.increment_day, 
                              func.sum(inner_query.c.metric)])
c_select = session.query(*[func.sum(C.c_metric),])\
                  .filter(C.c_increment_day==inner_query.c.increment_day)
outer_query = outer_query.add_column(c_select.as_scalar())

This generates SQL that looks like:

SELECT res.increment_day
     , sum(res.metric) AS sum_1
     , ( SELECT sum(`TestC`.c_metric) AS sum_2
           FROM `TestC`
              , ( SELECT anon_2.name AS name
                       , anon_2.metric AS metric
                       , anon_2.increment_day AS increment_day
                    FROM ( SELECT `TestA`.id AS `TestA_id`
                                , name AS name
                                , metric AS metric
                                , increment_day AS increment_day
                             FROM `TestA`
                            WHERE increment_day = '2012-01-01'
                            UNION ALL 
                           SELECT `TestB`.id AS `TestB_id`
                                , name AS name
                                , metric AS metric
                                , increment_day AS increment_day
                             FROM `TestB`
                            WHERE increment_day = '2012-01-02'
                         ) AS anon_2
                ) AS res
          WHERE `TestC`.c_increment_day = res.increment_day
       ) AS anon_1
  FROM ( SELECT anon_2.name AS name
              , anon_2.metric AS metric
              , anon_2.increment_day AS increment_day
           FROM ( SELECT `TestA`.id AS `TestA_id`
                       , name AS name
                       , metric AS metric
                       , increment_day AS increment_day
                    FROM `TestA`
                   WHERE increment_day = '2012-01-01'
                   UNION ALL 
                  SELECT `TestB`.id AS `TestB_id`
                       , name AS name
                       , metric AS metric
                       , increment_day AS increment_day
                    FROM `TestB`
                   WHERE increment_day = '2012-01-02'
                ) AS anon_2
       ) AS res

How can I set up my query so that the 'res' subquery is NOT repeated inside the scalar column subquery, so that the generated SQL instead looks like this:

SELECT res.increment_day
     , sum(res.metric) AS sum_1
     , ( SELECT sum(`TestC`.c_metric) AS sum_2
           FROM `TestC`
          WHERE `TestC`.c_increment_day = res.increment_day
       ) AS anon_1
  FROM ( SELECT anon_2.name AS name
              , anon_2.metric AS metric
              , anon_2.increment_day AS increment_day
           FROM ( SELECT `TestA`.id AS `TestA_id`
                       , name AS name
                       , metric AS metric
                       , increment_day AS increment_day
                    FROM `TestA`
                   WHERE increment_day = '2012-01-01'
                   UNION ALL 
                  SELECT `TestB`.id AS `TestB_id`
                       , name AS name
                       , metric AS metric
                       , increment_day AS increment_day
                    FROM `TestB`
                   WHERE increment_day = '2012-01-02'
                ) AS anon_2
       ) AS res
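
To make concrete what the desired query computes, here is a minimal sketch that runs a query of that shape against an in-memory SQLite database using only the standard library. The sample rows are invented for illustration, the dates are stored as text, and the GROUP BY and ORDER BY clauses are my additions so that the result is deterministic; none of that comes from the original setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TestA (id INTEGER PRIMARY KEY, name TEXT, metric INTEGER, increment_day TEXT);
CREATE TABLE TestB (id INTEGER PRIMARY KEY, name TEXT, metric INTEGER, increment_day TEXT);
CREATE TABLE TestC (c_id INTEGER PRIMARY KEY, c_metric INTEGER, c_increment_day TEXT);

-- Invented sample data, for illustration only.
INSERT INTO TestA VALUES (1, 'a1', 10, '2012-01-01'), (2, 'a2', 5, '2012-01-01');
INSERT INTO TestB VALUES (1, 'b1', 7, '2012-01-02');
INSERT INTO TestC VALUES (1, 100, '2012-01-01'), (2, 50, '2012-01-01'), (3, 30, '2012-01-02');
""")

# The scalar subquery selects only from TestC and refers to res.increment_day
# from the enclosing SELECT (a correlated subquery); 'res' appears exactly once.
rows = conn.execute("""
SELECT res.increment_day
     , sum(res.metric) AS sum_1
     , ( SELECT sum(TestC.c_metric)
           FROM TestC
          WHERE TestC.c_increment_day = res.increment_day
       ) AS anon_1
  FROM ( SELECT name, metric, increment_day
           FROM TestA
          WHERE increment_day = '2012-01-01'
          UNION ALL
         SELECT name, metric, increment_day
           FROM TestB
          WHERE increment_day = '2012-01-02'
       ) AS res
 GROUP BY res.increment_day   -- added for this sketch
 ORDER BY res.increment_day   -- added for this sketch
""").fetchall()

for row in rows:
    print(row)
```

With this sample data, each output row carries the summed metric from 'res' for one day alongside the summed c_metric from TestC for that same day.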

Solution

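  • Until SQLAlchemy 0.8, Query does not correlate automatically, so you have to tell it explicitly which enclosing SELECT it should correlate to. Calling correlate(inner_query) keeps 'res' out of the scalar subquery's own FROM clause: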

    c_select = session.query(*[func.sum(C.c_metric),])\
                      .filter(C.c_increment_day==inner_query.c.increment_day)\
                      .correlate(inner_query)
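
For reference, here is a self-contained sketch of the same idea that only builds and prints the SQL. It assumes SQLAlchemy 1.4 or later, where as_scalar() is spelled scalar_subquery() and Query already auto-correlates, so the explicit correlate() call is redundant there but harmless. The Core Table objects (test_a, test_b, test_c), the Core union_all() used in place of Query.union_all(), and the unbound Session() are simplifications for illustration, not the setup from the question.

```python
import datetime

from sqlalchemy import (Column, Date, Integer, MetaData, String, Table,
                        func, select, union_all)
from sqlalchemy.orm import Session

metadata = MetaData()

# Illustrative Core tables mirroring TestA / TestB / TestC from the question.
test_a = Table("TestA", metadata,
               Column("id", Integer, primary_key=True),
               Column("name", String),
               Column("metric", Integer),
               Column("increment_day", Date))
test_b = Table("TestB", metadata,
               Column("id", Integer, primary_key=True),
               Column("name", String),
               Column("metric", Integer),
               Column("increment_day", Date))
test_c = Table("TestC", metadata,
               Column("c_id", Integer, primary_key=True),
               Column("c_metric", Integer),
               Column("c_increment_day", Date))

# Unbound session: it is only used to build Query objects, nothing is executed.
session = Session()

a_select = select(test_a.c.name, test_a.c.metric, test_a.c.increment_day).where(
    test_a.c.increment_day == datetime.date(2012, 1, 1))
b_select = select(test_b.c.name, test_b.c.metric, test_b.c.increment_day).where(
    test_b.c.increment_day == datetime.date(2012, 1, 2))
inner_query = union_all(a_select, b_select).subquery("res")

# correlate(inner_query) tells the scalar subquery to take 'res' from the
# enclosing SELECT instead of listing it in its own FROM clause.
c_select = (
    session.query(func.sum(test_c.c.c_metric))
    .filter(test_c.c.c_increment_day == inner_query.c.increment_day)
    .correlate(inner_query)
)

outer_query = session.query(
    inner_query.c.increment_day,
    func.sum(inner_query.c.metric),
    c_select.scalar_subquery(),
)

sql = str(outer_query.statement)
print(sql)
```

In the printed statement the UNION ALL subquery should appear once, in the outer FROM clause, while the scalar subquery selects from TestC alone.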