Search code examples
pythonsqlsql-serversqlalchemyouter-apply

OUTER APPLY in SQLAlchemy


I want to write a SQL Server's outer apply query like the one below using SQLAlchemy.

The FX table may not have the corresponding row on the date in the price table, so I need to use to OUTER APPLY to get the last row in the FX table for each date.

SELECT p.EffectiveDate, p.Close_ * FX.Rate as USD_PRICE
FROM PRICE p
OUTER APPLY (
    SELECT TOP 1 *
    FROM FX
    WHERE 
        FromCurrency = p.Currency
        AND ToCurrency = 'USD'
        AND ExRateDate <= p.EffectiveDate
    ORDER BY ExRateDate DESC
) fx

Brief background on the tables:

  • The PRICE table is a timeseries which has the EffectiveDate, Currency and Close_ Column. EffectiveDate is the primary key.
  • The FX table has the FromCurrCode, ToCurrCode, ExRateDate and Rate columns. The primary key is (FromCurrCode, ToCurrCode, ExRateDate)
  • My use case is to join the PRICE and FX tables to get the USD price. However, for a given EffectiveDate in the PRICE table, there may not be row in FX table on that date. Therefore, the tables cannot be joinned directly with FX.ExRateDate = PRICE.EffectiveDate. To fix the problem, I need to use OUTER APPLY to find the last row in the FX table that the FX.ExRateDate is the closest to the PRICE.EffectiveDate

It seems that SQLAlchemy doesn't support outer apply expression. I took a look at Custom SQL Constructs and Compilation Extension. But I am not sure how to create a custom constructs of outer apply. Do you have an example of it?

I guess a workaround would be replacing the outer apply with outer join. If you can provide a query can produce the same result without using outer apply, that would solve my problem too.

Thanks


Solution

  • Using Correlated Subquery here is the solution using OUTER JOIN:

    sq = (
        session.query(FX.id.label("last_id"))
        .filter(FX.FromCurrency == Price.Currency)
        .filter(FX.ToCurrency == 'USD')
        .filter(FX.ExRateDate <= Price.EffectiveDate)
        .order_by(FX.ExRateDate.desc())
        .order_by(FX.id.desc())  # handle duplicates just in case
        .limit(1)
        .correlate(Price)
        .as_scalar()
    )
    
    q = session.query(
        Price.EffectiveDate,
        (Price.Close_ * FX.Rate).label("USD_PRICE"),
    ).outerjoin(FX, FX.id == sq)