My dataset looks like this. For every combination of customer ID, order ID, and ship date, I would like to retrieve one process date (the latest one) that is less than or equal to the ship date. If the process date is greater than the ship date and no lower process date exists, then use the ship date as the process date.
+-------------+----------+------------+--------------+
| Customer ID | Order ID | Ship Date  | Process Date |
+-------------+----------+------------+--------------+
| 1000        | 100      | 9/17/2020  | 9/17/2020    |
| 1000        | 100      | 9/17/2020  | 10/16/2020   |
| 1000        | 100      | 9/17/2020  | 9/16/2020    |
| 2000        | 200      | 8/15/2020  | 8/13/2020    |
| 2000        | 300      | 10/14/2020 | 10/13/2020   |
| 3000        | 400      | 3/4/2020   | 4/2/2020     |
| 3000        | 400      | 3/4/2020   | 3/3/2020     |
| 3000        | 400      | 3/4/2020   | 3/5/2020     |
| 4000        | 500      | 5/1/2020   | 5/3/2020     |
| 5000        | 600      | 6/1/2020   | 7/1/2020     |
| 5000        | 600      | 6/1/2020   | 7/2/2020     |
| 6000        | 700      | 7/14/2020  | 7/13/2020    |
| 6000        | 700      | 7/14/2020  | 6/10/2020    |
+-------------+----------+------------+--------------+
Desired Output
+-------------+----------+------------+--------------+
| Customer ID | Order ID | Ship Date  | Process Date |
+-------------+----------+------------+--------------+
| 1000        | 100      | 9/17/2020  | 9/17/2020    |
| 2000        | 200      | 8/15/2020  | 8/13/2020    |
| 2000        | 300      | 10/14/2020 | 10/13/2020   |
| 3000        | 400      | 3/4/2020   | 3/3/2020     |
| 4000        | 500      | 5/1/2020   | 5/1/2020     |
| 5000        | 600      | 6/1/2020   | 6/1/2020     |
| 6000        | 700      | 7/14/2020  | 7/13/2020    |
+-------------+----------+------------+--------------+
I tried using ROWNUM and the date difference, but I'm stuck after getting the row numbers in ascending order. I'm not sure how to proceed.
"If the process date is greater than the ship date and no lower process date exists, then use the ship date as the process date."
Do a GROUP BY. You can use MAX() over a CASE expression to return the latest ProcessDate <= ShipDate. If no such ProcessDate exists, COALESCE() falls back to ShipDate.
select CustomerID, orderID, ShipDate,
       -- latest ProcessDate on or before ShipDate; ShipDate if there is none
       coalesce(MAX(case when ProcessDate <= ShipDate then ProcessDate end), ShipDate) as ProcessDate
from tablename
group by CustomerID, orderID, ShipDate
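To check the query against the sample data from the question, here is a minimal sketch that loads the rows into an in-memory SQLite database and runs the same GROUP BY. Several things here are assumptions for illustration only: the use of SQLite and Python, the column types, the added ORDER BY, and the dates being rewritten as ISO strings (YYYY-MM-DD). That last point matters: the `<=` comparison only works chronologically if the columns are real date types or ISO-formatted text. Strings such as 9/17/2020 would compare alphabetically and give wrong results.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO strings
# (an assumption) so that text comparison matches chronological order.
rows = [
    (1000, 100, "2020-09-17", "2020-09-17"),
    (1000, 100, "2020-09-17", "2020-10-16"),
    (1000, 100, "2020-09-17", "2020-09-16"),
    (2000, 200, "2020-08-15", "2020-08-13"),
    (2000, 300, "2020-10-14", "2020-10-13"),
    (3000, 400, "2020-03-04", "2020-04-02"),
    (3000, 400, "2020-03-04", "2020-03-03"),
    (3000, 400, "2020-03-04", "2020-03-05"),
    (4000, 500, "2020-05-01", "2020-05-03"),
    (5000, 600, "2020-06-01", "2020-07-01"),
    (5000, 600, "2020-06-01", "2020-07-02"),
    (6000, 700, "2020-07-14", "2020-07-13"),
    (6000, 700, "2020-07-14", "2020-06-10"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename ("
    "CustomerID INTEGER, orderID INTEGER, ShipDate TEXT, ProcessDate TEXT)"
)
conn.executemany("INSERT INTO tablename VALUES (?, ?, ?, ?)", rows)

# Same logic as the answer: the CASE yields NULL for process dates after the
# ship date, MAX() ignores NULLs, and COALESCE() supplies ShipDate when every
# process date in the group was filtered out.
query = """
SELECT CustomerID, orderID, ShipDate,
       COALESCE(MAX(CASE WHEN ProcessDate <= ShipDate THEN ProcessDate END),
                ShipDate) AS ProcessDate
FROM tablename
GROUP BY CustomerID, orderID, ShipDate
ORDER BY CustomerID, orderID, ShipDate
"""
result = conn.execute(query).fetchall()

for row in result:
    print(row)
```

With this data the query returns one row per customer/order/ship date, and orders 500 and 600 (which have only later process dates) fall back to their ship dates, matching the desired output in the question.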