select database-design sqlalchemy aggregate-functions

Is there something like func.group_concat() in SQLALchemy Core that will return a list of objects, rather than a string?

Is there a convention/is it possible to return an array/list of nested objects within a column, e.g. from joined many-to-many tables?

For example, say I have three normalised tables: "movies", "actors", and a junction table "actors_in_movies". I want to retrieve multiple movies, one movie per row, with a list of actors for that movie within its row.

A basic SELECT statement on the joined tables would retrieve:

"Bill & Ted's Excellent Adventure", "Keanu Reeves"
"Bill & Ted's Excellent Adventure", "Alex Winter"
"Bill & Ted's Excellent Adventure", "George Carlin"
"The Matrix", "Keanu Reeves"
"The Matrix", "Carrie-Anne Moss"
"The Matrix", "Lawrence Fishbourne"

Using func.group_concat() I could retrieve:
"Bill & Ted's Excellent Adventure", "Keanu Reeves, Alex Winter, George Carlin"
"The Matrix", "Keanu Reeves, Carrie-Anne Moss, Lawrence Fishbourne"

And with GROUP_BY and an aggregate function like COUNT I could retrieve:
"Bill & Ted's Excellent Adventure", 3
"The Matrix", 3

But does SQL (and SQL Alchemy Core by extension) support returning something where I can access nested aggregated results, like:
"Bill & Ted's Excellent Adventure", ["Keanu Reeves", "Alex Winter", "George Carlin"]
"The Matrix", ["Keanu Reeves", "Carrie-Anne Moss", "Lawrence Fishbourne"]

I understand this is a primary use case for SQLAlchemy ORM - is there a conventional way to manage this using pure SQL/SQL Alchemy Core? Does it require multiple queries? Or is it usually left to the application layer to manage?

Solution

is there a conventional way to manage this using pure SQL/SQL Alchemy Core?

In short: No, there is no such database-independent mechanism in Core. That's what ORM is for.

However, Core can take advantage of a such a feature in a specific database. For example, PostgreSQL has array_agg():

from sqlalchemy import Column, ForeignKey, Integer, MetaData, String, Table, create_engine, func, insert, select

engine = create_engine("postgresql://scott:tiger@192.168.0.199/test")

meta = MetaData()

movie = Table(
    "movie",
    meta,
    Column("id", Integer, primary_key=True, autoincrement=False),
    Column("title", String, nullable=False),
)

actor = Table(
    "actor",
    meta,
    Column("id", Integer, primary_key=True, autoincrement=False),
    Column("name", String, nullable=False),
)

actor_in_movie = Table(
    "actor_in_movie",
    meta,
    Column("actor_id", Integer, ForeignKey("actor.id"), primary_key=True),
    Column("movie_id", Integer, ForeignKey("movie.id"), primary_key=True),
)

meta.drop_all(engine)
meta.create_all(engine)

# populate example data
with engine.begin() as conn:
    conn.execute(
        insert(movie),
        [
            dict(id=1, title="Bill & Ted's Excellent Adventure"),
        ],
    )
    conn.execute(
        insert(actor),
        [
            dict(id=1, name="Reeves, Keanu"),
            dict(id=2, name="Carlin, George"),
        ],
    )
    conn.execute(
        insert(actor_in_movie),
        [
            dict(actor_id=1, movie_id=1),
            dict(actor_id=2, movie_id=1),
        ],
    )

with engine.begin() as conn:
    qry = (
        select(movie.c.title, func.array_agg(actor.c.name).label("actors"))
        .select_from(actor)
        .join(actor_in_movie)
        .join(movie)
        .group_by(movie.c.title)
    )
    result = conn.execute(qry)
    for row in result:
        print(row)
        # ("Bill & Ted's Excellent Adventure", ['Reeves, Keanu', 'Carlin, George'])
        print(type(row.actors))  # <class 'list'>
        print(type(row.actors[0]))  # <class 'str'>