Let's assume I've 3 columns in a table with values like this:
table_1:
A | B | C
-----------------------
'xx' | '' | 'y'
'x' | 'y' | 'x'
'x' | 'x' | 'y'
'x' | 'yy' | ''
'x' | '' | 'yy'
'x' | 'y' | 'y'
I've a result set (result of an SQL SELECT statement) which I want to identify in the above table if it exists there:
[
('x', 'x', 'y')
('x', 'y', 'y')
]
This result set would match for 5 (of 6) rows in instead of the 2 from the table above if I've compared the results of simple string concatenation, e.g. I would simply compare the results of this: SELECT concat(A, B, C) FROM table_1
I could solve this problem with comparing the results of more complex string concatenation functions like this: SELECT concat('A=', A, '_', 'B=', B, '_', 'C=', C )
BUT:
_
or =
xx_C=yy
Question:
Is it possible to solve somehow this problem without using string concatenation? Is there a simple solution for this multi column value checking problem?
I want to solve this in BiqQuery, but I'm interested in a general solution for every relational databse/datawarehouse.
Thank you!
CREATE TABLE test.table_1 (
A STRING,
B STRING,
C STRING
) AS
SELECT * FROM (
SELECT 'xx', '', 'y'
UNION ALL
SELECT 'x', 'y', 'x'
UNION ALL
SELECT 'x', 'x', 'y'
UNION ALL
SELECT 'x', 'yy', ''
UNION ALL
SELECT 'x', '', 'yy'
UNION ALL
SELECT 'x', 'y', 'y'
)
SELECT A, B, C
FROM test.table_1
WHERE (A, B, C) IN ( -> I need this functionality
SELECT 'x', 'x', 'y'
UNION ALL
SELECT 'x', 'y', 'y'
);
Below is the most generic way I can think of (BigQuery Standard SQL):
#standardSQL
SELECT *
FROM `project.test.table1` t
WHERE t IN (
SELECT t
FROM `project.test.table2` t
)
You can test, play with above using sample data from your question as in below example
#standardSQL
WITH `project.test.table1` AS (
SELECT 'xx' a, '' b, 'y' c UNION ALL
SELECT 'x', 'y', 'x' UNION ALL
SELECT 'x', 'x', 'y' UNION ALL
SELECT 'x', 'yy', '' UNION ALL
SELECT 'x', '', 'yy' UNION ALL
SELECT 'x', 'y', 'y'
), `project.test.table2` AS (
SELECT 'x' a, 'x' b, 'y' c UNION ALL
SELECT 'x', 'y', 'y'
)
SELECT *
FROM `project.test.table1` t
WHERE t IN (
SELECT t
FROM `project.test.table2` t
)
with output
Row a b c
1 x x y
2 x y y