Search code examples
sqlvertica

vsql/Vertica: Select top 5 rows by group


I have this query that it's supposed to take first n rows of grouped data. I used RANK() in combination with OVER PARTITION BY to identify top n rows per group:

SELECT X.USERID, X.ARTID, X.AVGTIMEONPAGE,EDP.ARTDSC,
RANK() OVER (PARTITION BY X.USERID ORDER BY X.AVGTIMEONPAGE DESC) as rank
FROM
(SELECT GANG.userID AS USERID,GANG.avgTimeOnPage AS AVGTIMEONPAGE,   
split_part(GANG.pageTitle,' -',1) as ARTID
FROM GoogleAnalytics.navigazioneG AS GANG
WHERE GANG.pagePath LIKE '%DataSheets%' ) AS X
LEFT JOIN ESPDDS.ESP_DPRODUCT AS EDP
ON EDP.ARTID=X.ARTID AND EDP.SCD_IS_CURRENT=1
AND EDP.COMPANYID=1
WHERE X.ARTID NOT LIKE '%Company%' AND rank in (1,2,3,4,5)

it gives me an error saying that rank column doesn't exist. If I comment the last part of WHERE Clause I can see that column rank is computed correctly.

Thanks


Solution

  • The WHERE clause gets evaluated before the SELECT clause. So at that time rank is unknown. You can use a further subquery to access it:

    SELECT *
    FROM
    (
      SELECT 
        X.USERID, 
        X.ARTID, 
        X.AVGTIMEONPAGE,
        EDP.ARTDSC,
        RANK() OVER (PARTITION BY X.USERID ORDER BY X.AVGTIMEONPAGE DESC) as rank
      FROM
      (
        SELECT 
          GANG.userID AS USERID,
          GANG.avgTimeOnPage AS AVGTIMEONPAGE,   
          split_part(GANG.pageTitle,' -',1) as ARTID
        FROM GoogleAnalytics.navigazioneG AS GANG
        WHERE GANG.pagePath LIKE '%DataSheets%' 
      ) AS X
      LEFT JOIN ESPDDS.ESP_DPRODUCT AS EDP ON EDP.ARTID = X.ARTID
                                          AND EDP.SCD_IS_CURRENT = 1 
                                          AND EDP.COMPANYID = 1
      WHERE X.ARTID NOT LIKE '%Company%' 
    ) ranked
    WHERE rank in (1,2,3,4,5);