Search code examples
google-bigquery

How to IGNORE NULLS in ANY_VALUE aggregation when using HAVING?


From the ANY_VALUE docs

ANY_VALUE behaves as if IGNORE NULLS is specified; rows for which expression is NULL are not considered and won't be selected.

and from the HAVING MAX docs

This clause ignores NULL values when computing the maximum value unless having_expression evaluates to NULL for all rows.

I expected this query

WITH
  foo AS (
    SELECT 1 AS a, "hello" AS b, 10 as some_timestamp
    UNION ALL
    SELECT 1, NULL, 11
    UNION ALL
    SELECT 2, "hi", 99
  )
SELECT
  a,
  ANY_VALUE(b HAVING MAX some_timestamp) latest_b
FROM
  foo
GROUP BY
  a

to return

a latest_b
1 hello
2 hi

but instead, it returns

a latest_b
1 null
2 hi

The implicit IGNORE NULLS in ANY_VALUE only happens when not using HAVING.

Is there a way to have IGNORE NULLS in ANY_VALUE even when using HAVING? (I'd like to avoid switching to WINDOW with LAST_VALUE(B IGNORE NULLS) OVER ... instead.)


Solution

  • You cannot play with returned value, however you can with the value you're maxing out (mind you max is ignoring nulls on default):

    SELECT
      a,
      ANY_VALUE(b HAVING MAX if(b is null, null, some_timestamp)) latest_b
    FROM
      foo
    GROUP BY
      a