Search code examples
postgresqlsqlalchemypython-asyncio

Sqlalchemy asyncio translate postgres query for GROUP_BY clause


I want to translate the below postgres query into Sqlalchemy asyncio format, but so far, I could only retrieve the first column only, or the whole row at once, while I need only to retrieve only two columns per record:

SELECT
    table.xml_uri,
    max(table.created_at) AS max_1 
FROM
    table
GROUP BY
    table.xml_uri 
ORDER BY
    max_1 DESC;

I reach out to the below translation, but this only returns the first column xml_uri, while I need both columns. I left the order_by clause commented out for now as it generates also the below error when commented in:

Sqlalchemy query:

from sqlalchemy.ext.asyncio import AsyncSession

query = "%{}%".format(query)
records = await session.execute(
    select(BaseModel.xml_uri, func.max(BaseModel.created_at))
    .order_by(BaseModel.created_at.desc())
    .group_by(BaseModel.xml_uri)
    .filter(BaseModel.xml_uri.like(query))
)

# Get all the records
result = records.scalars().all()

Error generated when commenting in order_by clause:

column "table.created_at" must appear in the GROUP BY clause or be used in an aggregate function


Solution

  • The query is returning a resultset consisting of two-element tuples. session.scalars() is taking the first element of each tuple. Using session.execute instead will provide the desired behaviour.

    It's not permissable to order by the date field directly as it isn't part of the projection, but you can give the max column a label and use that to order.

    Here's an example script:

    import sqlalchemy as sa
    from sqlalchemy import orm 
    
    Base = orm.declarative_base()
    
    
    class MyModel(Base):
        __tablename__ = 't73018397'
    
        id = sa.Column(sa.Integer, primary_key=True)
        code = sa.Column(sa.String)
        value = sa.Column(sa.Integer)
    
    
    engine = sa.create_engine('postgresql:///test', echo=True, future=True)
    Base.metadata.drop_all(engine)
    Base.metadata.create_all(engine)
    Session = orm.sessionmaker(engine, future=True)
    
    with Session.begin() as s:
        for i in range(10):
            # Split values based on odd or even
            code = 'AB'[i % 2 == 0]
            s.add(MyModel(code=code, value=i))
    
    
    with Session() as s:
        q = ( 
            sa.select(MyModel.code, sa.func.max(MyModel.value).label('mv'))
            .group_by(MyModel.code)
            .order_by(sa.text('mv desc'))
        )   
        res = s.execute(q)
        for row in res:
            print(row)
    

    which generates this query:

    SELECT 
        t73018397.code, 
        max(t73018397.value) AS mv   
      FROM t73018397 
      GROUP BY t73018397.code 
      ORDER BY mv desc