google-bigquery concatenation etl array-merge sql-merge

BigQuery: Concatenate two arrays and keep distinct values within MERGE statement

I am working on a MERGE process and update an array field with new data but only if the value isn't already found in the array.

target table
+-----+----------+
| id  |  arr_col |
+-----+----------+
| a   |  [1,2,3] |
| b   |    [0]   |
+-----+----------+

source table
+-----+----------+
| id  |  arr_col |
+-----+----------+
| a   | [3,4,5] |
| b   |  [0,0]   |
+-----+----------+

target table post-merge
+-----+-------------+
| id  |  arr_col    |
+-----+-------------+
| a   | [1,2,3,4,5] |
| b   |    [0]      |
+-----+-------------+

I was trying to use SQL on this answer in my MERGE statement

merge into target t
using source
  on target.id = source.id
when matched then
update set target.arr_col = array(
                             select distinct x 
                             from unnest(array_concat(target.arr_col, source.arr_col)) x
                            )

but BigQuery shows me the following error: Correlated Subquery is unsupported in UPDATE clause.

Is there any other way to update this array field via MERGE? The target and source tables can be quite large and would run daily. So it's a process I would like to have incremental updates for as opposed to recreating entire table with new data every time.

Solution

Below is for BigQuery Standard SQL

merge into target
using (
  select id, 
    array(
      select distinct x 
      from unnest(source.arr_col || target.arr_col) as x
      order by x
    ) as arr_col
  from source 
  join target
  using(id)
) source
  on target.id = source.id
when matched then
update set target.arr_col = source.arr_col;