Search code examples
sqlsql-server-2008mintransitivity

SQL Server Query to find min date based on transitive relationship


Running into a bit of a headache with this one query and some hints/suggestions would be greatly appreciated. I couldn't find anything really related to my problem -I found some stuff on transitive closures which isn't quite what I need since my data can possibly create a loop/cycle which, I think, would cause a recursive call to infinitely loop.

Say I have two basic tables with the data displayed below them. Full disclosure: the Memberships table is a CTE which has already done a fair bit of logic to calcualte the CServiceDate column value. The Transfers table is an actual table which doesn't contain a whole lot other than a PK and FromMembershipID and ToMembershipID relationships.

Memberships

 
+==========+==============+================+==============+
| PersonID | MemberShipID | MembershipDate | CServiceDate |
+==========+==============+================+==============+
|        1 |           15 | Aug-01-2016    | Aug-27-2017  |
+----------+--------------+----------------+--------------+
|        1 |           16 | Mar-25-2016    | Sep-01-2000  |
+----------+--------------+----------------+--------------+
|        1 |           17 | Dec-06-2011    | May-15-1995  |
+----------+--------------+----------------+--------------+
|        1 |           18 | Jan-12-2009    | Feb-28-1998  |
+----------+--------------+----------------+--------------+
|        1 |           19 | Apr-08-2016    | Jul-10-1994  |
+----------+--------------+----------------+--------------+
|        1 |           20 | Jun-11-2010    | Nov-12-1997  |
+----------+--------------+----------------+--------------+

Transfer

+=====+=====+==================+================+
| TID | PID | FromMembershipID | ToMembershipID |
+=====+=====+==================+================+
|   1 |   1 |               16 |            15  |
+-----+-----+------------------+----------------+
|   2 |  1  |               18 |             17 |
+-----+-----+------------------+----------------+
|   3 |   1 |               19 |            17  |
+-----+-----+------------------+----------------+
|   4 |   1 |               20 |            18  |
+-----+-----+------------------+----------------+
|   5 |   1 |               20 |            19  |
+-----+-----+------------------+----------------+

Problem What I need from a query is for each row in the Memberships CTE (i.e. for each MembershipID), I want to return the MIN CServiceDate for all related MembershipIDs. I will call this MIN value the ECSD (Expected Credited Service Date). The calculation of the ECSD has only two conditions:

  • A membership record is considered "related" to the current MembershipID if it somehow transitively tied to the current MembershipID via the Transfers table by looking at the FromMembershipID and ToMembershipID columns. e.g. For MembershipID 20, if we look at the Transfers table, we can see that MembershipIDs 20, 19, 18, and 17 are all related through transitivity (aside: via the Transfers table, we can see that MembershipIDs 15 and 16 are related to one another, but not to [20,19,18,17] )
  • Of the transitively-related membership listing derived directly above, the only memberships that can be considered in the transitive relationship listing when calculating the ECSD are the Memberships with a MembershipDate EARLIER than the current MembershipID. e.g. Based on the MembershipDates for a given MembershipID and the Transfers table, if looking at MembershipID 17, MembershipID 19 cannot be considered when calculating the ECSD for MembershiID 17 since MembershipID 19 has a MembershipDate (Apr-08-2016) that is NOT earlier than MembershipID 17's (Dec-06-2011)

Expected Output of the ECSD column

+==========+==============+================+==============+=============+
| PersonID | MemberShipID | MembershipDate | CServiceDate |    ECSD     |
+==========+==============+================+==============+=============+
|        1 |           15 | Aug-01-2016    | Aug-27-2017  | Sep-01-2000 |
+----------+--------------+----------------+--------------+-------------+
|        1 |           16 | Mar-25-2016    | Sep-01-2000  | Sep-01-2000 |
+----------+--------------+----------------+--------------+-------------+
|        1 |           17 | Dec-06-2011    | May-15-1995  | May-15-1995 |
+----------+--------------+----------------+--------------+-------------+
|        1 |           18 | Jan-12-2009    | Feb-28-1998  | Feb-28-1998 |
+----------+--------------+----------------+--------------+-------------+
|        1 |           19 | Apr-08-2016    | Jul-10-1994  | Jul-10-1994 |
+----------+--------------+----------------+--------------+-------------+
|        1 |           20 | Jun-11-2010    | Nov-12-1997  | Nov-12-1997 |
+----------+--------------+----------------+--------------+-------------+

Please Note:

  • A MembershipID can show up multiples times in the FromMembershipID and ToMembershipID columns. These column values do not have to be unique.
  • The tables provided above are simple relationships. The Memberships/Transfers can go as little as 1-tier deep (i.e. 1 transfer) to N-tiers deep where the transfers (if by just looking at the From/ToMembershipIDs) can create a cyclic loop. Below is an example with a deeper transfer relationship.
  • It may help to note that if a MembershipID does NOT exist in the TOMembershipID column, that MembershipID is the earliest (there may be more than one). Similarly, if a MembershipID does NOT exist in the FromMembershipID column, it is the end and most current membership.
    • There may be blocks or sets of transfers that relate some memberships together for a given person, but not all.

Example 2

Memberships

+==========+==============+================+==============+
| personid | membershipid | membershipdate | CServiceDate |
+==========+==============+================+==============+
|   499510 |       548426 | 2014-09-29     | 2014-09-29   |
+----------+--------------+----------------+--------------+
|   499510 |       548428 | 2014-01-29     | 2014-01-29   |
+----------+--------------+----------------+--------------+
|   499510 |       548425 | 2012-05-28     | 2012-05-28   |
+----------+--------------+----------------+--------------+
|   499510 |       548429 | 2011-11-23     | 2011-11-23   |
+----------+--------------+----------------+--------------+
|   499510 |       548427 | 2008-07-03     | 2008-07-03   |
+----------+--------------+----------------+--------------+
|   499510 |       548431 | 2001-05-01     | 1976-01-01   |
+----------+--------------+----------------+--------------+
|   499510 |       548430 | 1998-10-08     | 1998-10-08   |
+----------+--------------+----------------+--------------+

Transfers

+=======+========+==================+================+
|  tid  |  pid   | FromMembershipID | ToMembershipID |
+=======+========+==================+================+
| 10664 | 499510 |           548430 |         548431 |
+-------+--------+------------------+----------------+
| 10665 | 499510 |           548431 |         548427 |
+-------+--------+------------------+----------------+
| 10666 | 499510 |           548427 |         548429 |
+-------+--------+------------------+----------------+
| 10667 | 499510 |           548429 |         548425 |
+-------+--------+------------------+----------------+
| 10668 | 499510 |           548425 |         548428 |
+-------+--------+------------------+----------------+
| 10669 | 499510 |           548428 |         548426 |
+-------+--------+------------------+----------------+
| 13085 | 499510 |           548430 |         548427 |
+-------+--------+------------------+----------------+
| 13086 | 499510 |           548427 |         548425 |
+-------+--------+------------------+----------------+
| 13087 | 499510 |           548425 |         548426 |
+-------+--------+------------------+----------------+
| 13088 | 499510 |           548431 |         548429 |
+-------+--------+------------------+----------------+
| 13089 | 499510 |           548429 |         548428 |
+-------+--------+------------------+----------------+

Expected Outcome

+==========+==============+================+==============+============+
| personid | membershipid | membershipdate | CServiceDate |    ECSD    |
+==========+==============+================+==============+============+
|   499510 |       548426 | 2014-09-29     | 2014-09-29   | 1976-01-01 |
+----------+--------------+----------------+--------------+------------+
|   499510 |       548428 | 2014-01-29     | 2014-01-29   | 1976-01-01 |
+----------+--------------+----------------+--------------+------------+
|   499510 |       548425 | 2012-05-28     | 2012-05-28   | 1976-01-01 |
+----------+--------------+----------------+--------------+------------+
|   499510 |       548429 | 2011-11-23     | 2011-11-23   | 1976-01-01 |
+----------+--------------+----------------+--------------+------------+
|   499510 |       548427 | 2008-07-03     | 2008-07-03   | 1976-01-01 |
+----------+--------------+----------------+--------------+------------+
|   499510 |       548431 | 2001-05-01     | 1976-01-01   | 1976-01-01 |
+----------+--------------+----------------+--------------+------------+
|   499510 |       548430 | 1998-10-08     | 1998-10-08   | 1998-10-08 |
+----------+--------------+----------------+--------------+------------+

Thanks!


Solution

  • Here's an approach that uses CTEs to compile a list of transitive relationships. One CTE defines the relationships recursively from FromMembershipID to ToMembershipID. The 2nd CTE does the same thing in the opposite direction - that way, you can combine the results into a 3rd CTE that lists every transitive relationship between memberships.

    For instance, in the first result set, 20 => 19, and 19 => 17, and in reverse, 17 => 19 and 19 => 20, so you get:

    20   19
    20   17
    19   17
    17   19
    17   20
    19   20
    

    Then you can use that list transitive relationships to get the min CServiceDate by membership.

    This gives the correct output for both sets of sample data. You'll probably run into the 100 level recursion limit if you have much more data, but you can configure that. Or find a better solution :) but technically, this works:

    ;with cte_trans_relationships (frommembershipid, tomembershipid)
    as
    (
        select frommembershipid, tomembershipid
        from @transfer
        union all
        select c.frommembershipid, t.tomembershipid
        from cte_trans_relationships c
        inner join @transfer t on c.tomembershipid = t.frommembershipid
    ),
    cte_trans_relationships2 (tomembershipid, frommembershipid)
    as
    (
        select tomembershipid, frommembershipid
        from @transfer
        union all
        select c.tomembershipid, t.frommembershipid
        from cte_trans_relationships c
        inner join @transfer t on c.frommembershipid = t.tomembershipid
    ),
    cte_trans_all (m1, m2)
    as
    (
        select distinct frommembershipid m1, tomembershipid m2
        from cte_trans_relationships c
        union 
        select distinct tomembershipid, frommembershipid
        from cte_trans_relationships2 c2    
    )
    select m.personid, m.membershipid, m.membershipdate, m.cservicedate, 
        case when sq.cservicedate < cservicedate1 and sq.cservicedate < cservicedate2 then sq.cservicedate
        when cservicedate1 < sq.cservicedate and cservicedate1 < cservicedate2 then cservicedate1
        else cservicedate2 end as ECSD
    from @memberships m
    inner join 
    (
        select m.personid, m.membershipid, isnull(m.cservicedate, dateadd(year, 100, getdate())) as cservicedate, 
        isnull(min(m1.cservicedate), dateadd(year, 100, getdate())) as cservicedate1,
        isnull(min(m2.cservicedate), dateadd(year, 100, getdate())) as cservicedate2
        from @memberships m
        left join cte_trans_all sq_trans1 on m.membershipid = sq_trans1.m1
        left join @memberships m1 on sq_trans1.m2 = m1.membershipid and m1.membershipdate < m.membershipdate
        left join cte_trans_all sq_trans2 on m.membershipid = sq_trans2.m1
        left join @memberships m2 on sq_trans2.m2 = m2.membershipid and m2.membershipdate < m.membershipdate
        group by m.personid, m.membershipid, m.cservicedate
    )sq on m.personid = sq.personid and m.membershipid = sq.membershipid
    

    Here are sample DDL/DML statements for testing:

    declare @memberships table (personid int, membershipid int, membershipdate datetime, cservicedate datetime)
    insert into @memberships values (1, 15, '8/1/2016', '8/27/2017')
    insert into @memberships values (1, 16, '3/25/2016', '9/1/2000')
    insert into @memberships values (1, 17, '12/6/2011', '5/15/1995')
    insert into @memberships values (1, 18, '1/12/2009', '2/28/1998')
    insert into @memberships values (1, 19, '4/8/2016', '7/10/1994')
    insert into @memberships values (1, 20, '6/11/2010', '11/12/1997')
    --insert into @memberships values (499510, 548426, '9/29/2014', '9/29/2014')
    --insert into @memberships values (499510, 548428, '1/29/2014', '1/29/2014')
    --insert into @memberships values (499510, 548425, '5/28/2012', '5/28/2012')
    --insert into @memberships values (499510, 548429, '11/23/2011', '11/23/2011')
    --insert into @memberships values (499510, 548427, '7/3/2008', '7/3/2008')
    --insert into @memberships values (499510, 548431, '5/1/2001', '1/1/1976')
    --insert into @memberships values (499510, 548430, '10/8/1998', '10/8/1998')
    
    declare @transfer table (tid int, pid int, frommembershipid int, tomembershipid int)
    insert into @transfer values (1, 1, 16, 15)
    insert into @transfer values (2, 1, 18, 17)
    insert into @transfer values (3, 1, 19, 17)
    insert into @transfer values (4, 1, 20, 18)
    insert into @transfer values (5, 1, 20, 19)
    --insert into @transfer values (10664, 499510, 548430, 548431)
    --insert into @transfer values (10665, 499510, 548431, 548427)
    --insert into @transfer values (10666, 499510, 548427, 548429)
    --insert into @transfer values (10667, 499510, 548429, 548425)
    --insert into @transfer values (10668, 499510, 548425, 548428)
    --insert into @transfer values (10669, 499510, 548428, 548426)
    --insert into @transfer values (13085, 499510, 548430, 548427)
    --insert into @transfer values (13086, 499510, 548427, 548425)
    --insert into @transfer values (13087, 499510, 548425, 548426)
    --insert into @transfer values (13088, 499510, 548431, 548429)
    --insert into @transfer values (13089, 499510, 548429, 548428)
    

    Here are the 2 result sets:

    personid  membershipid  membershipdate    cservicedate  ECSD
    1         15            2016-08-01        2017-08-27    2000-09-01
    1         16            2016-03-25        2000-09-01    2000-09-01
    1         17            2011-12-06        1995-05-15    1995-05-15
    1         18            2009-01-12        1998-02-28    1998-02-28
    1         19            2016-04-08        1994-07-10    1994-07-10
    1         20            2010-06-11        1997-11-12    1997-11-12
    
    personid    membershipid    membershipdate  cservicedate    ECSD
    499510      548425          2012-05-28      2012-05-28      1976-01-01
    499510      548426          2014-09-29      2014-09-29      1976-01-01
    499510      548427          2008-07-03      2008-07-03      1976-01-01
    499510      548428          2014-01-29      2014-01-29      1976-01-01
    499510      548429          2011-11-23      2011-11-23      1976-01-01
    499510      548430          1998-10-08      1998-10-08      1998-10-08
    499510      548431          2001-05-01      1976-01-01      1976-01-01