Search code examples
sqlamazon-redshiftwindow-functions

Get max value from a window of rows as new column for all rows


| col 1 | col 2 | col 3 |
|-------|-------|-------|
| 67458 | ADM   | 1008  |
| 67458 | ADM   | 1009  |
| 67458 | SKI   | 1009  |
| 67458 | LIS   | 1010  |
| 67458 | TOU   | 1121  |

How to get max value of col3 when col2='ADM' and use that as the value for rest of the records?

Expected Result:

| col 1 | col 2 | col 3 | col 4 |
|-------|-------|-------|-------|
| 67458 | ADM   | 1008  | 1009  |
| 67458 | ADM   | 1009  | 1009  |
| 67458 | SKI   | 1009  | 1009  |
| 67458 | LIS   | 1010  | 1009  |
| 67458 | TOU   | 1121  | 1009  |

I know how to do this with sub-selects and all. col4 will be a pseudo-column to be used downstream for JOINs and stuff.

I have tried the following but it populates 1121 instead of 1009:

MAX(col3) OVER (PARTITION BY col1 (CASE WHEN col2='ADM' THEN col2 END) ORDER BY col1)

Solution

  • SELECT t.*, max(CASE WHEN col2 = 'ADM' THEN col3 END) OVER (PARTITION BY col1) AS col4
    FROM   tbl t;
    

    The aggregate FILTER clause was introduced in PostgreSQL 9.4 and would be nice for this. But Redshift is not Postgres, and most of the later additions in Postgres are unsupported there. Compare: