I have a data set continuously being updated with date values that looks like this:
Part Number Product Status Phase1 Phase2 Phase3 Phase4 Phase5 Phase6
FD 2000 Maintain 2020-01-03 00:00:00.000 2020-01-08 00:00:00.000 2020-01-15 00:00:00.000 2020-01-17 00:00:00.000 2020-01-22 00:00:00.000 2020-01-31 00:00:00.000
FD 2001 Maintain 2020-01-03 00:00:00.000 2020-01-08 00:00:00.000 2020-01-15 00:00:00.000 2020-01-17 00:00:00.000 2020-01-22 00:00:00.000 2020-01-31 00:00:00.000
Basically, the table is made up of a part number and it's respective status and life cycle dates.
Then, I utilize CROSS APPLY
to get transpose the different phase columns into one column. Here is the SQL syntax:
SELECT [Part Number], [Product Status], Phase, Date
FROM PLCMexample
CROSS APPLY (VALUES ('Phase1', [Phase1]),
('Phase2', [Phase2]),
('Phase3', [Phase3]),
('Phase4', [Phase4]),
('Phase5', [Phase5]),
('Phase6', [Phase6]))
CrossApplied (Phase, Date)
GO
This generates this view:
Part Number Product Status Phase Date
FD 2000 Maintain Registration Initiation 2020-01-03 00:00:00.000
FD 2000 Maintain Product Launch 2020-01-08 00:00:00.000
FD 2000 Maintain Phase Out 2020-01-15 00:00:00.000
FD 2000 Maintain Last Order 2020-01-17 00:00:00.000
FD 2000 Maintain Last Shipment 2020-01-22 00:00:00.000
FD 2000 Maintain Last Service 2020-01-31 00:00:00.000
FD 2001 Maintain Registration Initiation 2020-01-03 00:00:00.000
FD 2001 Maintain Product Launch 2020-01-08 00:00:00.000
FD 2001 Maintain Phase Out 2020-01-15 00:00:00.000
FD 2001 Maintain Last Order 2020-01-17 00:00:00.000
FD 2001 Maintain Last Shipment 2020-01-22 00:00:00.000
FD 2001 Maintain Last Service 2020-01-31 00:00:00.000
Which is all well and good, but now I need not only just a Date
column for the view, but StartDate
and EndDate
. The StartDate
would be the values in the current Date
column. The EndDate
would be the same Date
as the StartDate
of the following phase. Here's the first six rows of the view as an example of how it should look:
Part Number Product Status Phase StartDate EndDate
FD 2000 Maintain Registration Initiation 2020-01-03 00:00:00.000 2020-01-08 00:00:00.000
FD 2000 Maintain Product Launch 2020-01-08 00:00:00.000 2020-01-15 00:00:00.000
FD 2000 Maintain Phase Out 2020-01-15 00:00:00.000 2020-01-17 00:00:00.000
FD 2000 Maintain Last Order 2020-01-17 00:00:00.000 2020-01-22 00:00:00.000
FD 2000 Maintain Last Shipment 2020-01-22 00:00:00.000 2020-01-31 00:00:00.000
FD 2000 Maintain Last Service 2020-01-31 00:00:00.000 2020-01-31 00:00:00.000
As you can see, the "Registration Initiation" EndDate
is the same as the StartDate
of "Product Launch". The last phase, "Last Service", will have the same StartDate
and EndDate
.
What's the best way to handle this? Should I create a calculated column before or after the transposition? This is just a snippet, there are thousands of rows of data just like this so I would like to optimize the code.
I think you want lead()
:
SELECT [Part Number], [Product Status], Phase, Date,
LEAD(Date) OVER (PARTITION BY [Part Number] ORDER BY Date) as Next_Date
FROM PLCMexample CROSS APPLY
(VALUES ('Phase1', [Phase1]),
('Phase2', [Phase2]),
('Phase3', [Phase3]),
('Phase4', [Phase4]),
('Phase5', [Phase5]),
('Phase6', [Phase6])
) CrossApplied (Phase, Date);
EDIT:
If you want a default value, use the 3-argument form of LEAD()
:
LEAD(Date, 1, DATEADD(DAY, 100, DATE) OVER (PARTITION BY [Part Number] ORDER BY Date) as Next_Date