Search code examples
sqljointeradatagaps-and-islands

join table to Gap and Islands SQL Teradata


I have data stored on the DB as gaps and islands and I need to join it with another table. the gaps and islands table like the below:

|Subs_ID|ORIGINAL_STATUS|NEW_STATUS|CHANGE_DATE|
|-------+---------------+----------+-----------|
|123456 |1              |2         |12/2/2017  |
|123456 |2              |3         |12/8/2019  |
|123456 |3              |4         |12/18/2019 |
|123456 |4              |8         |12/28/2019 |
|123456 |8              |9         |10/4/2020  |

the second table only includes the Subs_ID and sequenced Connect_date

|Subs_ID|CONNECT_DATE|
|-------+------------|
|123456 |12/1/2017   |
|123456 |12/3/2017   |
|123456 |11/4/2018   |
|123456 |10/5/2019   |
|123456 |12/30/2019  |
|123456 |10/4/2020   |
|123456 |5/21/2021   |

I need to join the current STATUS from the first table to the second one using the subs_id and the dates. and the result will be the below where if the Connect_date less than the Change_Date will take the first ORIGINAL_STATUS and vs if the Connect_date more than the Change_Date will take the last NEW_STATUS

|Subs_ID|CONNECT_DATE|Status|
|-------+------------+------|
|123456 |12/1/2017   |1     |
|123456 |12/3/2017   |2     |
|123456 |11/4/2018   |2     |
|123456 |10/5/2019   |2     |
|123456 |12/30/2019  |8     |
|123456 |10/4/2020   |8     |
|123456 |5/21/2021   |9     |

Solution

  • I hope this query helps you. Code breakdown and explanation given below

    select a.*,B.ORIGINAL_STATUS
    from mytable2 a
    join (select *,LAG(CHANGE_DATE) OVER(ORDER BY CHANGE_DATE) as previous_date_value
    from mytable)b
    on a.Subs_ID =b.Subs_ID and a.CONNECT_DATE >b.previous_date_value and a.CONNECT_DATE<= b.CHANGE_DATE
    
    UNION
    
    select * from (
    select a.*,CASE WHEN A.CONNECT_dATE<B.CHANGE_DATE THEN b.ORIGINAL_STATUS ELSE NULL END AS STATUS
    from mytable2 a
    join (
    select Subs_ID,ORIGINAL_STATUS,NEW_STATUS,CHANGE_DATE from
    (
    select Subs_ID,ORIGINAL_STATUS,NEW_STATUS,CHANGE_DATE,row_number() over(partition by Subs_ID order by CHANGE_DATE)rn
    from mytable
    )mytable_less
    where rn=1)b
    on a.Subs_ID=b.Subs_ID)less_than_available_date
    where status is not null
    
    UNION
    
    select * from (
    select a.*,CASE WHEN A.CONNECT_dATE>B.CHANGE_DATE THEN b.NEW_STATUS ELSE NULL END AS STATUS
    from mytable2 a
    join (
    select Subs_ID,ORIGINAL_STATUS,NEW_STATUS,CHANGE_DATE from
    (
    select Subs_ID,ORIGINAL_STATUS,NEW_STATUS,CHANGE_DATE,row_number() over(partition by Subs_ID order by CHANGE_DATE desc)rn
    from mytable
    )mytable_less
    where rn=1)b
    on a.Subs_ID=b.Subs_ID)greater_than_available_date
    where status is not null
    

    Explanation: First block of code with return the status of the connect date if the connect date is present with in the range of available change dates. I am using lag function to fetch previous change date value and comparing with connect date.

    select a.*,B.ORIGINAL_STATUS
    from mytable2 a
    join (select *,LAG(CHANGE_DATE) OVER(ORDER BY CHANGE_DATE) as previous_date_value
    from mytable)b
    on a.Subs_ID =b.Subs_ID and a.CONNECT_DATE >b.previous_date_value and a.CONNECT_DATE<= b.CHANGE_DATE
    

    This gives us the following result set

    +---------+--------------+-----------------+
    | Subs_ID | CONNECT_DATE | ORIGINAL_STATUS |
    +---------+--------------+-----------------+
    |  123456 | 2017-12-03   |               2 |
    |  123456 | 2018-11-04   |               2 |
    |  123456 | 2019-10-05   |               2 |
    |  123456 | 2019-12-30   |               8 |
    |  123456 | 2020-10-04   |               8 |
    +---------+--------------+-----------------+
    

    Now we want to find the status of 12/1/2017 which is lesser than the available change date. table 2 is joined with table 1 which has the row of minimum change date and ORIGINAL_STATUS is taken if the connect date is lesser than change date..

    select * from (
    select a.*,CASE WHEN A.CONNECT_dATE<B.CHANGE_DATE THEN b.ORIGINAL_STATUS ELSE NULL END AS STATUS
    from mytable2 a
    join (
    select Subs_ID,ORIGINAL_STATUS,NEW_STATUS,CHANGE_DATE from
    (
    select Subs_ID,ORIGINAL_STATUS,NEW_STATUS,CHANGE_DATE,row_number() over(partition by Subs_ID order by CHANGE_DATE)rn
    from mytable
    )mytable_less
    where rn=1)b
    on a.Subs_ID=b.Subs_ID)less_than_available_date
    where status is not null
    
    +---------+--------------+--------+
    | Subs_ID | CONNECT_DATE | STATUS |
    +---------+--------------+--------+
    |  123456 | 2017-12-01   |      1 |
    +---------+--------------+--------+
    

    Left over is the records with connect date greater than available change date. This is achieved by below code and table 2 is joined with table 1 having the row of max change date and new status is taken if the connect date is greater than change date.

    select * from (
    select a.*,CASE WHEN A.CONNECT_dATE>B.CHANGE_DATE THEN b.NEW_STATUS ELSE NULL 
    END AS STATUS
    from mytable2 a
    join (
    select Subs_ID,ORIGINAL_STATUS,NEW_STATUS,CHANGE_DATE from
    (
    select Subs_ID,ORIGINAL_STATUS,NEW_STATUS,CHANGE_DATE,row_number() 
    over(partition by Subs_ID order by CHANGE_DATE desc)rn
    from mytable
    )mytable_less
    where rn=1)b
    on a.Subs_ID=b.Subs_ID)greater_than_available_date
    where status is not null
    
        +---------+--------------+--------+
        | Subs_ID | CONNECT_DATE | STATUS |
        +---------+--------------+--------+
        |  123456 | 2021-05-21   |      9 |
        +---------+--------------+--------+
    

    Finally after applying union we get the required resultset

    +---------+--------------+-----------------+
    | Subs_ID | CONNECT_DATE | ORIGINAL_STATUS |
    +---------+--------------+-----------------+
    |  123456 | 2017-12-01   |               1 |
    |  123456 | 2017-12-03   |               2 |
    |  123456 | 2018-11-04   |               2 |
    |  123456 | 2019-10-05   |               2 |
    |  123456 | 2019-12-30   |               8 |
    |  123456 | 2020-10-04   |               8 |
    |  123456 | 2021-05-21   |               9 |
    +---------+--------------+-----------------+