Search code examples
sqlgoogle-bigqueryuser-defined-functions

BigQuery UDF: DECLARE/SET variable inside Standard SQL UDF body


NOTE: This may be considered a duplicate of How to declare variable inside BigQuery UDF body?, but I am not satisfied with the answers provided - None answer how to DECLARE/SET variables within the UDF, and the language documentation [1], [2] does not discuss this either.

Given the following JS UDF:

CREATE TEMP FUNCTION earliest_full_payment(
    payments ARRAY<STRUCT<event_key STRING, amount FLOAT64, td_amount FLOAT64, date DATE>>,
    td_charges_amount FLOAT64
)
RETURNS STRUCT<event_key STRING, amount FLOAT64, td_amount FLOAT64, date DATE>
LANGUAGE js
AS
r"""
    let result = null;
    for(const payment of payments) {
        const payment_amount = (payment.amount || 0)
        const td_payment_amount = (payment.td_amount || 0)
        if (payment_amount < 0) {
            if (td_payment_amount < td_charges_amount) {
                result = null;
            }
        } else {
            if (td_payment_amount >= td_charges_amount) {
                result = payment
            }
        }
    }
return result
""";

I am trying to produce the equivalent Standard SQL UDF:

CREATE TEMP FUNCTION earliest_full_payment(
    payments ARRAY<STRUCT<event_key STRING, amount FLOAT64, td_amount FLOAT64, date DATE>>,
    td_charges_amount FLOAT64
)
RETURNS STRUCT<event_key STRING, amount FLOAT64, td_amount FLOAT64, date DATE>
AS
((
    DECLARE result STRUCT<event_key STRING, amount FLOAT64, td_amount FLOAT64, date DATE>;
    SET result = NULL;
    FOR payment IN UNNEST(payments) DO
        DECLARE payment_amount FLOAT64;
        DECLARE td_payment_amount FLOAT64;
        SET payment_amount = IFNULL(payment.amount, 0);
        SET td_payment_amount = IFNULL(payment.td_amount, 0);
        IF payment_amount < 0 THEN
            IF td_payment_amount < td_charges_amount THEN
              SET result = NULL;
            END IF;
        ELSE
            IF td_payment_amount >= td_charges_amount THEN
                SET result = payment
            END IF;
        END IF;
    END FOR;
    RETURN result;
));

but I get Syntax error: Unexpected identifier "result" at [8:13]. I tried both with the DECLARE inside and outside of the UDF, and with a BEGIN..END around the statements.

Any idea how I can fix the syntax error?


Solution

  • Not sure I understood your logic in UDFs, but seems you can implement it with an UDF in SQL-lish way like below.

    • Below UDF has passed all TCs you defined
    CREATE TEMP FUNCTION earliest_full_payment(
      payments ARRAY<STRUCT<event_key STRING, amount FLOAT64, td_amount FLOAT64, date DATE>>,
      td_charges_amount FLOAT64
    ) AS ((
      SELECT CASE
               WHEN IFNULL(p.amount, 0) >= 0 AND IFNULL(p.td_amount, 0) >= td_charges_amount THEN p
               WHEN IFNULL(p.amount, 0) < 0 AND IFNULL(p.td_amount, 0) < td_charges_amount THEN null
             END
       FROM UNNEST(payments) p WITH offset
      ORDER BY offset DESC LIMIT 1
    ));