pythonpostgresqlsqlalchemyrow-number

Delete SQLAlchemy query using row_number()


I'm looking for an appropriate way to make query.delete() using row_number() function.

Below there is my current solution:

subquery = session.query(Foo,
                         func.row_number().over(order_by=desc(Foo.bar)).label("row_number")).subquery()
subquery = session.query(subquery).filter(subquery.c.row_number > 3)
subquery = subquery.from_self(Foo.id)

query = session.query(Foo).filter(
    Foo.id.in_(subquery)
)

I don't like that I need explicilty check if Foo.id in list of ids. I need to compare all object's columns without hardcoding them. Somthing like:

subquery = session.query(Foo,
                         func.row_number().over(order_by=desc(Foo.bar)).label("row_number")).subquery()
subquery = session.query(subquery).filter((subquery.c.row_number > 3))
subquery = subquery.from_self(Foo)

query = session.query(Foo).filter(
    Foo.in_(subquery)
)

But of course this does not work.

I would be grateful if anyone will suggest a solution.


Solution

  • After some time of investigations, I understood the following:

    1. You need to specify exact model fields for filtration. I created a method _get_reference_attrs that uses primary key fields, or all fields, if there is no primary key.
    2. Code in question used 3 nested select queries (not including delete query), but only 2 are required. For this you need to correctly collect what you are selecting in the upper selection (attrs_subquery)

    That's my solution:

    from sqlalchemy import func, desc, tuple_, inspect
    
    
    def _get_reference_attrs(model):
        """
        Collecting attributes of model that will be used in delete query
        By default the primary key columns are used. If no primary key - all columns are used.
        """
    
        mapper = inspect(model)
        primary_key_names = [primary_key.key for primary_key in mapper.primary_key]
        if primary_key_names:
            attrs = [attr for attr in mapper.attrs if attr.key in primary_key_names]
        else:
            attrs = mapper.attrs
    
        return attrs
    
    
    attrs = _get_reference_attrs(Foo)
    
    subquery = session.query(*attrs,
                             func.row_number().over(order_by=desc(Foo.bar)).label("row_number")).subquery()
    attrs_subquery = [getattr(subquery.c, attr.key) for attr in attrs]
    subquery = session.query(*attrs_subquery).select_from(subquery).filter(subquery.c.row_number > 1)
    
    query = session.query(Foo).filter(
        tuple_(*attrs).in_(subquery)
    )