When i run this query in data bricks i get the following error
[INVALID_SUBQUERY_EXPRESSION.SCALAR_SUBQUERY_RETURN_MORE_THAN_ONE_OUTPUT_COLUMN] Invalid subquery: Scalar subquery must return only one column, but got 3. SQLSTATE: 42823
`select
all.a,
all.b,
all.c
from com a11
join fam_grp a12
on ((a11.mkt_id || a11.brd_id) = (a12.mkt_id || a12.brd_id) and
a11.dataset = a12.dataset)
where ((a11.cust_id,
a11.mkt_id,
a11.o_cust_id)
in (((select pc21.cust_id,
pc21.mkt_id_1,
pc21.o_cust_id
from ZZMD00 pc21
group by pc21.cust_id,
pc21.mkt_id_1,
pc21.o_cust_id)
intersect (select pc21.cust_id,
pc21.mkt_id_1,
pc21.o_cust_id
from ZZMD01 pc21
group by pc21.cust_id,
pc21.mkt_id_1,
pc21.o_cust_id)))
and a11.dataset in ('C'))
group by
all.a,
all.b,
all.c`
I believe i get the issue after the where clause using a sub query, i guess that is causing the error.
You can't use row-comparators in your DBMS, so IN
with multiple columns doesn't work.
You need to use EXISTS
with either INTERSECT
or an inner WHERE
filter.
Also, given that you are using an EXISTS
or IN
, using GROUP BY
in the subquery doesn't make much sense anyway.
select
all.a,
all.b,
all.c
from com a11
join fam_grp a12
on a11.mkt_id = a12.mkt_id
and a11.brd_id = a12.brd_id
and a11.dataset = a12.dataset
where exists (
select
a11.cust_id,
a11.mkt_id,
a11.o_cust_id
intersect
select
pc21.cust_id,
pc21.mkt_id_1,
pc21.o_cust_id
from ZZMD00 pc21
intersect
select
pc21.cust_id,
pc21.mkt_id_1,
pc21.o_cust_id
from ZZMD01 pc21
)
and a11.dataset in ('C')
group by
all.a,
all.b,
all.c;
Alternatively
select
all.a,
all.b,
all.c
from com a11
join fam_grp a12
on a11.mkt_id = a12.mkt_id
and a11.brd_id = a12.brd_id
and a11.dataset = a12.dataset
where exists (select 1
from ZZMD00 pc21
where a11.cust_id = pc21.cust_id
and a11.mkt_id = pc21.mkt_id
and a11.o_cust_id = pc21.o_cust_id
intersect
select
pc21.cust_id,
pc21.mkt_id_1,
pc21.o_cust_id
from ZZMD01 pc21
)
and a11.dataset in ('C')
group by
all.a,
all.b,
all.c;
You can also do this as separate where
predicates.
select
all.a,
all.b,
all.c
from com a11
join fam_grp a12
on a11.mkt_id = a12.mkt_id
and a11.brd_id = a12.brd_id
and a11.dataset = a12.dataset
where exists (select 1
from ZZMD00 pc21
where a11.cust_id = pc21.cust_id
and a11.mkt_id = pc21.mkt_id
and a11.o_cust_id = pc21.o_cust_id
)
and exists (select 1
from ZZMD01 pc21
where a11.cust_id = pc21.cust_id
and a11.mkt_id = pc21.mkt_id
and a11.o_cust_id = pc21.o_cust_id
)
and a11.dataset in ('C')
group by
all.a,
all.b,
all.c;
I'm not sure why you are concatting mkt_id
and brd_id
, but it's unlikely to be correct or performant.