Search code examples
pythonpostgresqlsqlalchemyfull-text-searchalembic

SQLAlchemy + Alembic: How to create full text search index for column with weights using Alembic revision


I am trying to use Alembic to generate a revision that will create full text search indices for some columns in my tables. I've recently upgraded to SQLAlchemy 2.0 as well as PG15. Whenever I try to run an alembic upgrade head, I get a SyntaxError.

The Alembic code I'm using is:

op.create_index('idx_obj_eng_ts_vector', 'Objects', [sa.text('(setweight(to_tsvector("english", col_name_1), \'A\') || setweight(to_tsvector("english", col_name_2), \'B\')) || setweight(to_tsvector("english", col_name_3), \'C\')')], unique=False, postgresql_using='gin')

The model code I use to generate that is:

def create_tsvector(
    field_with_weights: list[tuple[str, str]], ts_config: str = "english"
):
    field, weight = field_with_weights[0]
    exp = func.setweight(
        func.to_tsvector(literal_column(f'"{ts_config}"'), field), weight
    )

    for field, weight in field_with_weights[1:]:
        exp = op(
            exp,
            "||",
            func.setweight(
                func.to_tsvector(literal_column(f'"{ts_config}"'), field), weight
            ),
        )

    return exp

Which is called like:

class Objects(Base):
    __tablename__ = "Objects"
    
    col_name_1 = Column(String)
    col_name_2 = Column(String)
    col_name_3 = Column(String)

    ...

    __object_eng_ts_vector__ = create_tsvector(
        field_with_weights=[(col_name_1, "A"), (col_name_2, "B"), (col_name_3, "C")],
        ts_config="english",
    )

    __table_args__ = (
        Index("idx_object_eng_ts_vector", __object_eng_ts_vector__, postgresql_using="gin"),
    )

The error I am getting when I call alembic upgrade head is:

psycopg2.errors.SyntaxError: syntax error at or near "||"
Sep 26 02:11:28 AM  LINE 1: ...| setweight(to_tsvector("english", col_name_2), 'B')) || setweig...
Sep 26 02:11:28 AM                                                               ^
Sep 26 02:11:28 AM  
Sep 26 02:11:28 AM  
Sep 26 02:11:28 AM  The above exception was the direct cause of the following exception:
...
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.SyntaxError) syntax error at or near "||"
Sep 26 02:11:28 AM  LINE 1: ...| setweight(to_tsvector("english", col_name_2), 'B')) || setweig...
Sep 26 02:11:28 AM                                                               ^
Sep 26 02:11:28 AM  
Sep 26 02:11:28 AM  [SQL: CREATE INDEX idx_object_eng_ts_vector ON "Objects" USING gin ((setweight(to_tsvector("english", col_name_1), 'A') || setweight(to_tsvector("english", col_name_2), 'B')) || setweight(to_tsvector("english", col_name_3), 'C'))]
Sep 26 02:11:28 AM  (Background on this error at: https://sqlalche.me/e/20/f405)

But my code/syntax looks fine to me based on the documentation/other StackOverflow answers.

I based my code off of:

  1. Create a Full Text Search index with SQLAlchemy on PostgreSQL
  2. SQLAlchemy: No literal value renderer is available for literal value "'english'" with datatype REGCONFIG

Solution

  • The '||' was causing issues when calling op.create_index, so I ended up switching to op.execute instead using raw SQL text.

    I also had to replace the literal_column(f'"{ts_config}"'), field) with func.to_tsvector(bindparam("ts_config", ts_config, type_=Text) to avoid the CompileError mentioned in the second link of my question, as well as the ProgrammingError mentioned here in this Github issue comment thread.

    Edit: using literal_column should work by reversing the quotes used, so literal_column(f"'{ts_config}'") should work as well.