I would like to compare two lists of strings using BigQuery.
DECLARE list ARRAY ; DECLARE list2 ARRAY ;
SET list1 = ['orange','red','black' ]; SET list2 = ['red','black' ];
AS a result, i would like to get 'orange' as it is the value missing in list2.
Can anyone help me please ?
I've tried this solution, but it takes too long. My lists actually contain hundreds of values, so I'd like something more efficient than adding 'as value union all select' in front of each value.
WITH list1 AS (
SELECT 'orange' AS value UNION ALL
SELECT 'red' AS value UNION ALL
SELECT 'black' AS value
)
,
list2 AS (
SELECT 'red' AS value UNION ALL
SELECT 'black' AS value
)
SELECT value
FROM (
SELECT value, 'list1' AS list_name
FROM list1
UNION ALL
SELECT value, 'list2' AS list_name
FROM list2
)
GROUP BY value
HAVING COUNT(*) = 1
ORDER BY value
Consider below (BigQuery Standard SQL)
DECLARE list1, list2 ARRAY<string>;
SET list1 = ['orange','red','black' ]; SET list2 = ['red','black' ];
SELECT * FROM (
SELECT DISTINCT el FROM UNNEST(list1) el
UNION ALL
SELECT DISTINCT el FROM UNNEST(list2) el
)
GROUP BY el
HAVING COUNT(*) = 1;
with output
Also consider below approach
DECLARE list1, list2 ARRAY<string>;
SET list1 = ['orange','red','black', 'green' ]; SET list2 = ['red','black', 'pink' ];
SELECT list, ARRAY_AGG(el) el FROM (
SELECT el, MIN(list) list FROM (
SELECT DISTINCT el, 'list1' list FROM UNNEST(list1) el
UNION ALL
SELECT DISTINCT el, 'list2' FROM UNNEST(list2) el
)
GROUP BY el
HAVING COUNT(*) = 1
)
GROUP BY list
with output