Search code examples
sqlgoogle-bigquery

Match array column with comma separated value in Bigquery Table


I have two tables in BigQuery - let's say:

Table A

column 1 value a
column 2 comma separated Values (aa,ab,ac)
column 3

Table B

column 1 Array [a,b,c,d]
column 2 Array [aa,ad]
column 4

Now I want to compare with two conditions:

  • if column 1 value from Table A is in array column 1 of Table B and any value of column 2 aa or ab or ac exists in array column 2 of Table B.
  • if this is true then write "yes" in the last column of Table A.

Solution

  • You can achieve this using a BigQuery query with UNNEST , EXISTS , and ARRAY\_CONTAINS.

    UPDATE `Your_project.your_dataset.TableA` AS a
    SET column3 = 'yes'
    WHERE EXISTS (
        SELECT 1
        FROM `Your_project.your_dataset.TableB` AS b,
             UNNEST(b.column1) AS b_col1,
             UNNEST(b.column2) AS b_col2,
             UNNEST(SPLIT(a.column2, ',')) AS a_col2 
        WHERE a.column1 = b_col1
          AND a_col2 IN (b_col2)
    );
    

    The EXISTS subquery efficiently checks for the existence of a matching row in TableB based on your conditions. Also, consider adding error handling to handle potential issues like invalid data types or missing columns. You might wanna ensure your tables have the correct data types for the columns involved in the comparison as well.