i have a table with billions of records which is holding Merchandise information as follows. ID is bigint with auto increment. Article, UPC and MRP is actual data. DataDate is having the information like from which date this MRP is applicable.
ID Article EANUPC MRP DataDate
8546417 20171554001 1220636 599 20/11/2015
18589213 20171554001 1220636 599 15/12/2017
18655485 20171554001 1220636 390 26/12/2017
18784953 20171554001 1220636 390 11/1/2018
18833697 20171554001 1220636 290 16/1/2018
18954190 20171554001 1220636 290 24/1/2018
19060047 20171554001 1220636 190 30/1/2018
19116702 20171554001 1220636 90 6/2/2018
20107113 20171554001 1220636 90 13/6/2018
20143100 20171554001 1220636 599 13/6/2018
i want to merge the records in such a way that if MRP is same for subsequent days. then it should show valid from and Valid To date for that MRP.
i require following output.
Article EANUPC MRP FromDate ToDate
20171554001 1220636 599 20/11/2015 25/12/2017
20171554001 1220636 390 26/12/2017 15/01/2018
20171554001 1220636 290 16/1/2018 29/01/2018
20171554001 1220636 190 30/1/2018 05/02/2018
20171554001 1220636 90 6/2/2018 12/06/2018
20171554001 1220636 599 13/6/2018 24/08/2018
Please help me to achieve this.
You can treat this as a gaps and islands problem. Using:
SELECT Article, EANUPC, MRP, DataDate,
ROW_NUMBER() OVER (PARTITION BY Article ORDER BY DataDate) -
ROW_NUMBER() OVER (PARTITION BY Article, MRP ORDER BY DataDate) AS grp
FROM mytable
ORDER BY DataDate
you get:
Article EANUPC MRP DataDate grp
---------------------------------------
20171554001 1220636 599 2015-11-20 0
20171554001 1220636 599 2017-12-15 0
20171554001 1220636 390 2017-12-26 2
20171554001 1220636 390 2018-01-11 2
20171554001 1220636 290 2018-01-16 4
20171554001 1220636 290 2018-01-24 4
20171554001 1220636 190 2018-01-30 6
20171554001 1220636 90 2018-02-06 7
20171554001 1220636 90 2018-06-13 7
20171554001 1220636 599 2018-06-13 7
You can now wrap the above query in a CTE
and perform grouping using grp
field:
;WITH CTE AS (
SELECT Article, EANUPC, MRP, DataDate,
ROW_NUMBER() OVER (PARTITION BY Article ORDER BY DataDate) -
ROW_NUMBER() OVER (PARTITION BY Article, MRP ORDER BY DataDate) AS grp
FROM mytable
)
SELECT Article, EANUPC, MRP, MIN(DataDate) AS FromDate
FROM CTE
GROUP BY Article, EANUPC, MRP, grp
Output:
Article EANUPC MRP FromDate
----------------------------------
20171554001 1220636 599 2015-11-20
20171554001 1220636 390 2017-12-26
20171554001 1220636 290 2018-01-16
20171554001 1220636 190 2018-01-30
20171554001 1220636 90 2018-02-06
20171554001 1220636 599 2018-06-13
You can get the final result using LEAD
:
;WITH CTE AS (
SELECT Article, EANUPC, MRP, DataDate,
ROW_NUMBER() OVER (PARTITION BY Article ORDER BY DataDate) -
ROW_NUMBER() OVER (PARTITION BY Article, MRP ORDER BY DataDate) AS grp
FROM mytable
), CTE2 AS (
SELECT Article, EANUPC, MRP, MIN(DataDate) AS FromDate
FROM CTE
GROUP BY Article, EANUPC, MRP,grp
)
SELECT Article, EANUPC, MRP, FromDate,
COALESCE(DATEADD(day, -1, LEAD(FromDate) OVER (ORDER BY FromDate)), FromDate) AS ToDate
FROM CTE2
Output:
Article EANUPC MRP FromDate ToDate
----------------------------------------------
20171554001 1220636 599 2015-11-20 2017-12-25
20171554001 1220636 390 2017-12-26 2018-01-15
20171554001 1220636 290 2018-01-16 2018-01-29
20171554001 1220636 190 2018-01-30 2018-02-05
20171554001 1220636 90 2018-02-06 2018-06-12
20171554001 1220636 599 2018-06-13 2018-06-13