I have data that looks like this:
invoice | dept | date |
---|---|---|
12345 | Xfer Bills | 2022-03-05 |
12345 | Vendor Mgmt | 2023-07-15 |
12345 | Government | 2023-07-29 |
12345 | Vendor Mgmt | 2023-09-16 |
12345 | Government | 2023-11-25 |
12345 | Government | 2023-12-30 |
12345 | Government | 2024-02-03 |
What I am trying to get back is the following:
invoice | dept | date |
---|---|---|
12345 | Xfer Bills | 2022-03-05 |
12345 | Vendor Mgmt | 2023-07-15 |
12345 | Government | 2023-07-29 |
12345 | Vendor Mgmt | 2023-09-16 |
12345 | Government | 2023-11-25 |
I was thinking of trying to do a Lead or Lag and see if the next/previous had changed. I am trying to get a path of an invoice
You can use LAG
within a CTE to compare each row's department dept
with the department of the preceding row for the same invoice, ordered by date. You can see that I created a new column PrevDept
to hold the value of the department from the previous row.
So, I select the rows where PrevDept
is either NULL
or different from the current dept
are selected, (excluding the ones of the same department and leaving only the first occurrence in each series of consecutive rows) :
CREATE TABLE InvoiceData (
invoice INT,
dept VARCHAR(50),
[date] DATE
);
INSERT INTO InvoiceData (invoice, dept, date)
VALUES
(12345, 'Xfer Bills', '2022-03-05'),
(12345, 'Vendor Mgmt', '2023-07-15'),
(12345, 'Government', '2023-07-29'),
(12345, 'Vendor Mgmt', '2023-09-16'),
(12345, 'Government', '2023-11-25'),
(12345, 'Government', '2023-12-30'),
(12345, 'Government', '2024-02-03');
WITH CTE AS (
SELECT
invoice,
dept,
date,
LAG(dept, 1) OVER(PARTITION BY invoice ORDER BY [date]) AS PrevDept
FROM InvoiceData
)
SELECT
invoice,
dept,
[date]
FROM CTE
WHERE PrevDept IS NULL OR PrevDept <> dept;
invoice | dept | date |
---|---|---|
12345 | Xfer Bills | 2022-03-05 |
12345 | Vendor Mgmt | 2023-07-15 |
12345 | Government | 2023-07-29 |
12345 | Vendor Mgmt | 2023-09-16 |
12345 | Government | 2023-11-25 |