Search code examples
sqldistinctamazon-redshiftdense-rank

The alias name RANK() function is not recognized in the where clause with DISTINCT columns


I have 2 tables with columns (customer, position, product ,sales_cycle, call_count , cntry_cd , owner_cd , cr8) and I am facing some challenges as mentioned below Kindly please help me to fix this

My Requirement

I have 2 tables test.table1 and test.table2

I need to insert values form "test.table2" by doing an select with "test.table1". But I am facing a problem i.e. I am getting some duplicates while loading data to "test.table2"

I have totally 8 columns in both the table but while loading I need to take the highest rank of the column "call_count" with condition of unique values of these columns (customer, position, product ,sales_cycle)

Query what I tried

select 
distinct (customer, position, product ,sales_cycle), 
rank () over (order by call_count desc) rnk, 
cntry_cd, 
owner_cd, 
cr8 
from test.table1 
where rnk=1

I am facing few challenges in the above query (The database I am using is RedShift)

1.I can't do distinct for only few columns

2.The alias name "rnk" is not recognized in the where clause

Kindly please help me to fix this , Thanks


Solution

  • You can't use a column alias on the same level where it's introduced. You need to wrap the query in a derived table. The distinct as shown is useless as well if you use rank()

    select customer, position, product, sales_cycle, 
           cntry_cd, owner_cd, cr8 
    from (
      select customer, position, product, sales_cycle, 
             cntry_cd, owner_cd, cr8,
             rank () over (order by call_count desc) rnk
      from test.table1 
    ) t 
    where rnk=1;
    

    The derived table adds no overhead to the processing time. In this case it is merely syntactic sugar to allow you to reference the column alias.