Search code examples
sqlpartition-problemset-based

SQL: Failing to bring back effective begin/end dates on order by/partition by query


I want to bring back a result set that returns the beginning effective date and the end effective date for an id with multiple supplier changes. To do this, I am looking at a transaction table that records the id, the supplier's id and the date the transaction occurred. In cases where the id has switched suppliers, I want to retire the old association and record the new association. My intent is to insert a new row with the latest switch date as the beginning effective date and a null as the end effective date. To complete the event, I want to update the last previous row with the end effective date populated with the latest switch date. In instances where I have a transaction, but the id hasn't switched suppliers I want to ignore the row.

What I have works for a single id, however, when I add the second id, the order/partitioning does not work.

Here is the script to generate the test rows. The sql that works for a single id is noted.

    -- Note: use this to emulate the known switched suppliers table

    create table #switched
    (lcard bigint);

    insert into #switched (lcard) values (700382)
    insert into #switched (lcard) values (832019)

    select * from #switched

    -- Note: this temp data represents a previously grouped/partitioned table
            --       prepped for this next phase of action

    create table #PartitionTest 
    (   FauxId int,
        lcard bigint,
        suppId int,
        switchDate datetime
    );


    INSERT INTO #PartitionTest (FauxId,lcard,suppId,switchDate) VALUES (1,700382,506,cast('Jun 23 2013 12:00AM' as datetime))
    INSERT INTO #PartitionTest (FauxId,lcard,suppId,switchDate) VALUES (2,700382,49401,cast('May 22 2013 12:00AM' as datetime))
    INSERT INTO #PartitionTest (FauxId,lcard,suppId,switchDate) VALUES (3,700382,49401,cast('May  4 2013 12:00AM' as datetime))
    INSERT INTO #PartitionTest (FauxId,lcard,suppId,switchDate) VALUES (4,700382,49401,cast('May  2 2013 12:00AM' as datetime))
    INSERT INTO #PartitionTest (FauxId,lcard,suppId,switchDate) VALUES (5,700382,49401,cast('Apr 26 2013 12:00AM' as datetime))
    INSERT INTO #PartitionTest (FauxId,lcard,suppId,switchDate) VALUES (6,700382,49401,cast('Mar 15 2013 12:00AM' as datetime))
    INSERT INTO #PartitionTest (FauxId,lcard,suppId,switchDate) VALUES (1,832019,27088,cast('Jun 18 2013 12:00AM' as datetime))
    INSERT INTO #PartitionTest (FauxId,lcard,suppId,switchDate) VALUES (2,832019,232,cast('May 24 2013 12:00AM' as datetime))
    INSERT INTO #PartitionTest (FauxId,lcard,suppId,switchDate) VALUES (3,832019,232,cast('May 23 2013 12:00AM' as datetime))
    INSERT INTO #PartitionTest (FauxId,lcard,suppId,switchDate) VALUES (4,832019,232,cast('May 22 2013 12:00AM' as datetime))
    INSERT INTO #PartitionTest (FauxId,lcard,suppId,switchDate) VALUES (5,832019,232,cast('May 21 2013 12:00AM' as datetime))
    INSERT INTO #PartitionTest (FauxId,lcard,suppId,switchDate) VALUES (6,832019,232,cast('May 17 2013 12:00AM' as datetime))
    INSERT INTO #PartitionTest (FauxId,lcard,suppId,switchDate) VALUES (7,832019,232,cast('May 16 2013 12:00AM' as datetime))

    -- Note: Order results by lcard, then order the supplier id by
        --       the transaction date found. FauxId is from the previous partitioning

    select * from #PartitionTest
     order by lcard,fauxId, suppId, switchDate

    -- This is the statement that is failing when attempting to utilize 
        -- the ids in #switched as the criterion processing sets of ids.

    ;with sifted
        as ( select *,
          row_number() over (ORDER BY switchDate) - row_number() over (PARTITION BY suppId ORDER BY switchDate) as G
        from #PartitionTest
        where lcard in
         (select lcard
            from #switched
          )
         -- // DEBUG TEST: specific card holder(s)
         -- NOTE: when both lcards are used, the beginEffDate/endEffDate goal fails
         -- and lcard in ('8320198','7003824')
         -- NOTE: this represents the intent
         and lcard in ('832019')
        ),
       refined as
         (select lcard,
                 suppId,
                MIN(switchDate) BeginEffDate,
                ROW_NUMBER() OVER (ORDER BY min(switchDate)) as OrgSplit
         from sifted
          group by lcard,suppId, G)
   select a.lcard,
          a.suppId,
          a.BeginEffDate,
          b.BeginEffDate as EndEffDate
    from refined a
          left join refined b
            on a.OrgSplit + 1 = b.OrgSplit
    order by a.lcard, a.suppId



    -- drop table #switched;
    -- drop table #PartitionTest;

==================================================================

EDITED

Here are the desired results:

enter image description here


Solution

  • In SQL Server 2012, I have the dense_rank() option rather than working with partition/over. The solution:

        WITH R AS (
        SELECT
            lcard,
            suppId,
            switchDate,
            DENSE_RANK() OVER(PARTITION BY lcard ORDER BY switchDate) -
            DENSE_RANK() OVER(PARTITION BY lcard ORDER BY suppId, switchDate) AS grp
        FROM
            #PartitionTest
        ),
        S AS (
        SELECT
            lcard,
            suppId,
            MAX(switchDate) AS dt,
            ROW_NUMBER() OVER(PARTITION BY lcard ORDER BY MAX(switchDate)) AS rn
        FROM
            R
        GROUP BY
            lcard,
            suppId,
            grp
        )
        SELECT
            A.lcard,
            A.suppId,
            A.dt AS BeginEffDate,
            B.dt AS EndEffDate
        FROM
            S AS A
            LEFT OUTER JOIN
            S AS B
            ON A.lcard = B.lcard
            AND A.rn = B.rn - 1
        ORDER BY
            lcard,
            BeginEffDate
        GO