Search code examples
sqlteradatapartition-by

Muliple "level" conditions on partition by SQL


I have to populate a teradata table from another source where that can be simplify like that:

+------+------+------------+------------+
| Col1 | Col2 |    Col3    |    Col4    |
+------+------+------------+------------+
| 1234 |    0 | 01/01/2009 | 01/04/2019 |
| 1234 |    3 | 01/01/2010 | 01/05/2020 |
| 2345 |    1 | 20/02/2013 | 01/04/2019 |
| 2345 |    0 | 20/02/2013 | 01/04/2018 |
| 2345 |    2 | 31/01/2009 | 01/04/2017 |
| 3456 |    0 | 01/01/2009 | 01/04/2019 |
| 3456 |    1 | 01/01/2015 | 01/04/2019 |
| 3456 |    1 | 01/01/2015 | 01/05/2017 |
| 3456 |    3 | 01/01/2015 | 01/04/2019 |
+------+------+------------+------------+

Col1 is duplicated in source so we have rules to select the right row (with col1 unique in final result) For if value in col1 :

  • If value is duplicated then select the most recent date in Col3
  • If (and only if) it is still duplicated then select row with col2=1
  • If still duplicated then select most recent date in col4.

Considering the the previous table we should get the following result :

+------+------+------------+------------+
| Col1 | Col2 |    Col3    |    Col4    |
+------+------+------------+------------+
| 1234 |    3 | 01/01/2010 | 01/05/2020 |
| 2345 |    1 | 20/02/2013 | 01/04/2019 |
| 3456 |    1 | 01/01/2015 | 01/04/2019 |
+------+------+------------+------------+

I start using partition by to group each value occurrences in col 3 but i have no good idea on how to apply the conditions for each partion in a sql query

Thank you for your help


Solution

  • You can use QUALIFY in Teradata to simplify the syntax:

    SELECT col1, col2, col3, col4
    FROM mytable
    QUALIFY ROW_NUMBER() OVER(
      PARTITION BY col1 -- Group rows by "col1" values
      ORDER BY col3 DESC, CASE WHEN col2 = 1 THEN 1 ELSE 2 END, col4 DESC -- Order rows
    ) = 1 -- Get "first" row in each group
    

    Otherwise, this is the same as the answer above.