Search code examples
pypika

PyPika how to generate IF statement


How do I generate an IF statement in PyPika?

I am trying to generate a BigQuery query that pivots a row to a column. I found that if I use the following in a query (where date_range is from a WITH statement):

IF (date_range.kind = 'year', date_range.name, NULL) as year

then this will work. However, I haven't found a way to generate this SQL fragment in PyPika.

For completeness, this is an example of a query I need to run in BigQuery:

WITH date_range AS (
SELECT
    CAST(EXTRACT(year FROM year) as string) name,
    'year' kind,
    year start_date,
    DATE_ADD(year, INTERVAL 1 year) end_date
FROM UNNEST(GENERATE_DATE_ARRAY('2010-01-01','2020-06-01',INTERVAL 1 year)) year

UNION ALL

SELECT
    FORMAT_DATE('%B', month)||' '||EXTRACT(year FROM month) name,
    'month' kind,
    month start_date,
    DATE_ADD(month,INTERVAL 1 month) end_date
FROM
    UNNEST(GENERATE_DATE_ARRAY('2010-01-01','2020-06-01',INTERVAL 1 month)) month
)
SELECT
IF(date_range.kind='year', date_range.name, null) as year,
IF(date_range.kind='month', date_range.name, null) as month,
SUM(sales.sales_value) sales_value,
FROM sales
JOIN date_range ON sales.start_date>=date_range.start_date AND sales.end_date<date_range.end_date
GROUP BY year, month
ORDER BY year, month

The more general question I have is, is there a way to pass literal strings to PyPika so that those will be included in the resulting query string? There are several SQL fragments that Pypika does not generate (such as GENERATE_DATE_ARRAY and UNNEST, at least as far I can find) and passing the actual SQL fragment to PyPika would solve the problem.

Thanks!


Solution

  • Not sure if it applies but be sure to also check whether the CASE statement can help you.

    Other than that you can either subclass PyPika's Function class and overwrite get_sql and use that or (ab)use the CustomFunction and PseudoColumn utility classes like this:

    from pypika import CustomFunction
    
    sales_table = Table('sales')
    MyIf = CustomFunction('IF', ['condition', 'if', 'else'])
    q = Query.from_(sales_table).select(
        MyIf(PseudoColumn("date_range.kind = 'year'"), PseudoColumn("date_range.name"), None, alias="year")
    )
    

    However, I'd probably recommend making a ticket on the PyPika Github.

    Note: I wasn't able to test this.