Search code examples
google-cloud-platformgoogle-bigquerybigquery-udf

Workarounds for emulating UDFs for window aggregates in BigQuery?


I'm trying to write custom aggregate functions in BigQuery. In PGSQL, I can write user-defined aggregate functions which can be used with over clause, but I'm not able to write any such aggregate functions for BigQuery - would it be possible to write a function that takes in an entire array of a column of a partition and return a value based on some custom calculation?

Example of what I tried:

CREATE OR REPLACE FUNCTION temp_db.temp_func(arr ARRAY<int64>)
RETURNS int64 LANGUAGE js AS """
  return arr.length*10 //example calculation
  //actual result involves looping over the array and doing few calculations 
""";

select s_id, temp_db.temp_func(s_price) over (partition by s_id order by s_date rows 40 preceding) as temp_col
from temp_db.s_table;

This gives an error: Query error: Function temp_db.temp_func does not support an OVER clause at [6:19]

The existing aggregate functions are not sufficient for my purpose so I need to be able to execute custom calculation over custom window sizes. Are there any workarounds for this in BigQuery?


Solution

  • CREATE OR REPLACE FUNCTION temp_db.temp_func(arr ARRAY<int64>)
    RETURNS int64 LANGUAGE js AS """
      return arr.length*10 //example calculation
      //actual result involves looping over the array and doing few calculations 
    """;
    
    select s_id, temp_db.temp_func(ARRAY_AGG(s_price) over (partition by s_id order by s_date rows 40 preceding)) as temp_col
    from temp_db.s_table;