python sql sqlalchemy flask-sqlalchemy lateral-join

Limit results from a Python SQLAlchemy query to the top results from a subgroup


I have the following tables:

User
   id
   name

Points
   id
   user_id
   total_points(int)

user_id is a foreign key that references the id column of the User table. In the Points table, each user can have multiple entries, for instance:

User A - 1000
User B - 1500
User A - 1250
User C - 3000
User A - 500
etc...

What I want to get is the top 3 total_points values for each user in the Points table.

I can get all the entries on this table by doing this:

db.session.query(Points).all()

I can also get the results ordered by user:

db.session.query(Points).order_by(Points.user_id).all()

Then I can get the results ordered by user and, within each user, from highest to lowest points:

db.session.query(Points).order_by(Points.user_id).order_by(Points.total_points.desc()).all()

Now, I'm trying to get ONLY the top 3 total_points for EACH user. I'm trying to use the lateral clause, but it's not working, maybe because I'm not 100% sure how to use it.

Here is what I'm trying:

subquery = db.session.query(User.name, Points.total_points).join(Points, Points.user_id == User.id).filter(User.id == Points.user_id).order_by(Points.user_id).order_by(Points.total_points.desc()).limit(3).subquery().lateral('top 3')

db.session.query(User.name, Points.total_points).select_from(Points).join(subquery, db.true())

Solution

  • What you are looking for is the window function rank().
    Here is a simple example from the PostgreSQL documentation:

    SELECT depname, empno, salary,
    rank() OVER (PARTITION BY depname ORDER BY salary DESC)
    FROM empsalary;
    
      depname  | empno | salary | rank
    -----------+-------+--------+------
     develop   |     8 |   6000 |    1
     develop   |    10 |   5200 |    2
     develop   |    11 |   5200 |    2
     develop   |     9 |   4500 |    4
     develop   |     7 |   4200 |    5
     personnel |     2 |   3900 |    1
     personnel |     5 |   3500 |    2
     sales     |     1 |   5000 |    1
     sales     |     4 |   4800 |    2
     sales     |     3 |   4800 |    2
    (10 rows)
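
The same idea can be sketched against the tables from the question. This is a minimal example under some assumptions: it uses Python's built-in sqlite3 module with an in-memory database (window functions need SQLite 3.25 or newer), the table names users and points and the sample rows are made up for illustration, and it swaps rank() for row_number() so that ties at the cut-off cannot return more than three rows per user.

```python
import sqlite3

# In-memory database with hypothetical tables mirroring the question's schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE points (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        total_points INTEGER
    );
    INSERT INTO users (id, name) VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO points (user_id, total_points) VALUES
        (1, 1000), (2, 1500), (1, 1250), (3, 3000), (1, 500), (1, 750);
""")

# Number each user's rows from highest to lowest points, then keep rows 1..3.
top3_sql = """
    SELECT name, total_points
    FROM (
        SELECT u.name AS name,
               p.total_points AS total_points,
               row_number() OVER (
                   PARTITION BY p.user_id
                   ORDER BY p.total_points DESC
               ) AS rn
        FROM points AS p
        JOIN users AS u ON u.id = p.user_id
    ) AS ranked
    WHERE rn <= 3
    ORDER BY name, total_points DESC
"""
rows = conn.execute(top3_sql).fetchall()
print(rows)
# [('A', 1250), ('A', 1000), ('A', 750), ('B', 1500), ('C', 3000)]
```

The window function has to be computed in an inner query because its result cannot be referenced in the WHERE clause of the same SELECT. User A has four entries here, so the lowest one (500) is dropped.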
    

    Here is an example from another question that uses SQLAlchemy:

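    from sqlalchemy import func

    # table1 is the table from the other question, not from this one.
    # Rank the rows within each id, newest date first, and expose the rank as 'rnk'.
    subquery = db.session.query(
        table1,
        func.rank().over(
            order_by=table1.c.date.desc(),
            partition_by=table1.c.id
        ).label('rnk')
    ).subquery()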
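
Applied to the User and Points tables from the question, the pattern might look like the sketch below. Several things here are assumptions rather than the asker's actual code: the model definitions are reconstructed from the table description, plain SQLAlchemy (1.4 or newer) with an in-memory SQLite engine stands in for the Flask-SQLAlchemy db.session, the sample rows are invented, and row_number() is used instead of rank() so that tied scores cannot yield more than three rows per user.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class User(Base):  # hypothetical model matching the question's User table
    __tablename__ = "user"
    id = Column(Integer, primary_key=True)
    name = Column(String)


class Points(Base):  # hypothetical model matching the question's Points table
    __tablename__ = "points"
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey("user.id"))
    total_points = Column(Integer)


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([User(id=1, name="A"), User(id=2, name="B"), User(id=3, name="C")])
    session.add_all([
        Points(user_id=1, total_points=1000),
        Points(user_id=2, total_points=1500),
        Points(user_id=1, total_points=1250),
        Points(user_id=3, total_points=3000),
        Points(user_id=1, total_points=500),
        Points(user_id=1, total_points=750),
    ])
    session.commit()

    # Inner query: number each user's entries from highest to lowest points.
    ranked = (
        session.query(
            User.name.label("name"),
            Points.total_points.label("total_points"),
            func.row_number().over(
                partition_by=Points.user_id,
                order_by=Points.total_points.desc(),
            ).label("rn"),
        )
        .select_from(User)
        .join(Points, Points.user_id == User.id)
        .subquery()
    )

    # Outer query: keep only the first three rows of each partition.
    top3 = (
        session.query(ranked.c.name, ranked.c.total_points)
        .filter(ranked.c.rn <= 3)
        .order_by(ranked.c.name, ranked.c.total_points.desc())
        .all()
    )
    result = [(name, pts) for name, pts in top3]

print(result)
# [('A', 1250), ('A', 1000), ('A', 750), ('B', 1500), ('C', 3000)]
```

With Flask-SQLAlchemy, the same two queries should work with db.session in place of the session object. If tied scores should all be kept, func.rank() can be used instead of func.row_number(), at the cost of possibly returning more than three rows for a user.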