Search code examples
pythonsqlsqlalchemyflask-sqlalchemy

Converting SQL that uses unnest and ilike to SQLAlchemy


I have a single table, books, with multiple columns, one of the columns categories is an array of strings

select * from books where exists (select * from (select  unnest(books.categories)) x(categories) where x.categories ilike 'fiction%');

Having lots of trouble converting this to SQLAlchemy query string. What I am trying to do is to get all books that has at least a single category starting with the word 'fiction'.

(the simple query below does not work, it is just so you can easily understand what I am trying to achieve with the complex query above)

select categories from books where 'fiction%' ilike any(categories);

Solution

  • You need column valued functions, the following are lines of importance.

    column_valued = func.unnest(Book.categories).column_valued()
    statement = select(Book).where(select(column_valued).where(column_valued.ilike('fiction%')).exists())
    

    The code generates the following sql

    SELECT books.id, books.name, books.categories 
    FROM books 
    WHERE EXISTS (SELECT anon_1 
    FROM unnest(books.categories) AS anon_1 
    WHERE anon_1 ILIKE %(param_1)s::VARCHAR)
    

    Here is a complete example that shows what you want.

    from sqlalchemy import create_engine, select, func, String
    from sqlalchemy.orm import Session, DeclarativeBase, Mapped, mapped_column
    from sqlalchemy.dialects.postgresql import ARRAY
    
    class Base(DeclarativeBase):
        pass
    
    class Book(Base):
        __tablename__ = 'books'
        id: Mapped[int] = mapped_column(primary_key=True)
        name: Mapped[str]
        categories: Mapped[list[str]] = mapped_column(ARRAY(String))
    
    engine = create_engine("connection string")
    Base.metadata.drop_all(engine)
    Base.metadata.create_all(engine)
    
    with Session(engine) as session:
        # insert sample test data
        session.add(Book(name='Book 1', categories=['fiction', 'mystery', 'romance', 'adventure']))  # just fiction
        session.add(Book(name='Book 2', categories=['historical', 'horror']))  # no fiction
        session.add(Book(name='Book 3', categories=['fiction historical', 'horror']))  # starts with fiction
        session.add(Book(name='Book 4', categories=['horror']))  # no fiction
        session.add(Book(name='Book 5', categories=['fiction romance', 'horror']))  # starts with fiction
        session.commit()
    
    with Session(engine) as session:
        column_valued = func.unnest(Book.categories).column_valued()
        statement = select(Book).where(
            select(column_valued).where(column_valued.ilike('fiction%')).exists()
        )
    
        for i in session.scalars(statement):
            print(i.name)
    

    Output

    Book 1
    Book 3
    Book 5
    

    Books where none of the categories start with fiction are not shown here, which I believe is what you want.