Search code examples
pythonsqlalchemydistinct

Best away to distinct a list of objects by two fields equals with sqlalchemy


I have some duplicated records in my database with different Ids, like:

id      | title         | date              |flag |  
--------+---------------+-------------------+-----+
    2099|Test           |2022-11-29 00:00:00| 1   |
    2100|Test           |2022-11-29 00:00:00| 1   |
    2101|Test           |2022-11-29 00:00:00| 1   |
    2102|Test           |2022-11-29 00:00:00| 1   |

and this is the code to get all records from my database:

event_get = Event.query.filter(Event.flag == 1)
result = events_schema.dump(event_get)
data = dict(event=result)

I wanna distinct the results by title and date, so as the example with four records, I would have all of them with the same title and date, returning only one record

I did not find a way of do this with sqlalchemy, is It possible or I would need to use a alternative code in the returned list?


Solution

  • Pass the columns to distinct() to generate distinct on like this:

    events = session.query(Event).distinct(Event.title, Event.date).filter(Event.flag == 1).all()
    

    This uses a subquery that finds the smallest id for each group of title/date. The main query then only returns events with ids in that subquery. I guess check and see if this works in mysql. My small test seems to work.

    from sqlalchemy.sql import select, func
    subq = select(func.min(Event.id)).group_by(Event.title, Event.date)
    events = session.query(Event).where(Event.id.in_(subq)).all()
    

    Also if you just need those exact values you can probably just use regular DISTINCT like this:

    for (title, date) in session.query(Event.title, Event.date).distinct().all():
        print (title, date)