Search code examples
jsonsqlalchemyormtrino

JSON parsing with Trino and SQLAlchemy ORM


Is there a way to use SQLAlchemy's ORM framework with a JSON column in Trino? Here's my attempt so far:

pip install trino[sqlalchemy]

(See https://github.com/trinodb/trino-python-client#sqlalchemy for installation details.)

from sqlalchemy import create_engine, select
from sqlalchemy.dialects import postgresql
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Session
from sqlalchemy.schema import Table, Column, MetaData

engine = create_engine(...)

Base = declarative_base()

table = Table(
  'table',
  Base.metadata,
  Column('json_col', postgresql.JSON), # I have tried other JSON types: https://docs.sqlalchemy.org/en/20/core/type_basics.html#sqlalchemy.types.JSON
  autoload_with=engine,
  extend_existing=True)

with Session(engine_prod_a_agentservice) as session:
    
  test_ids = ['abc','def']
    
  stmt = select(table).where(
        table.c.json['key1']['key2'].in_(test_ids))
    
  print(stmt)
    
  result = session.execute(stmt)

This does not work because the generated WHERE clause is not correct (the following code being taken from the output of print(stmt)):

WHERE ((table.json_col[:json_col_1]) ->> :param_1) IN (__[POSTCOMPILE_param_2])

In the Trino dialect of SQL I would achieve this as follows:

WHERE cast(json_extract(json_col,'$.key1.key2') as varchar) IN ('abc', 'def')

Is what I want to achieve possible or is the functionality just not there (yet)?


Solution

  • As you have self-explained, you need to use the Trino 'json_extract' method. Try something like this:

    from sqlalchemy import func, String
    
    stmt = select(table).where(func.cast(func.json_extract(table.json_col, f'$.key1.key2'), String).in_(test_ids))