Search code examples
sqlhadoophivehiveqlqubole

Exclude records with certain values in Qubole


Using Qubole

I have

Table A (columns in json parsed...)

ID  Recommendation    Decision
1     GOOD            GOOD
2     BAD             BAD
2     GOOD            BAD
3     GOOD            BAD
4     BAD             GOOD
4     GOOD            BAD

I need to Select only IDs which have Recommendation GOOD but Decision BAD. Therefore output should be 3.

I tried :

SELECT a.ID  
FROM (
select json_parsed['ID'] as ID
,json_parsed["Decision"] as Decision
,json_parsed["Recommendation"] as Recommendation
from  A  
where create_date >= '2020-11-18') a
Left JOin
(select json_parsed['ID'] as ID
,json_parsed["Decision"] as Decision
,json_parsed["Recommendation"] as Recommendation
from  A
where create_date >= '2020-11-18') as b on a.ID = b.ID and b.Recommendation = "GOOD"
Where
b.Recommendation is NULL

Solution

  • Use analytic functions.

    Demo:

    with your_table as (--use your table instead of this sample
    select stack(6,
    1,'GOOD','GOOD',
    2,'BAD','BAD'  ,
    2,'GOOD','BAD' ,
    3,'GOOD','BAD' ,
    4,'BAD','GOOD' ,
    4,'GOOD','BAD') as (ID,Recommendation,Decision)
    )
    
    select ID,Recommendation,Decision
    from
    (
    select d.*, 
           count(*) over(partition by id) as cnt,
           count(case when Recommendation  = 'GOOD' then 1 end) over(partition by id) cnt_Recommendation_good, 
           count(case when Decision  = 'BAD' then 1 end) over(partition by id) cnt_Decision_BAD
      from 
      your_table d
    ) s
    where cnt_Recommendation_good=cnt
      and cnt_Decision_BAD = cnt
    

    Result:

    id  recommendation  decision
    3   GOOD            BAD