I am struggling with the Vertica SQL Statement to delete all entries of an OrderID after a certain event had occurred.
This is a representation of my data.
**OrderID Activity**
1 Item Started
1 Task 1 complete
1 In storage
1 Deletion
1 Deletion Complete
2 Item Started
2 Deletion
2 Deletion Complete
In this scenario, I would like to delete the entries "Deletion" and "Deletion Complete" of the OrderID 1, since they occur after the activity "In storage". I want to keep all entries of OrderID 1 until (and including) "In Storage".
For OrderID 2 this is not the case, therefore I want to keep it.
If your situation is exactly as you described it - and you indeed have no relevant timestamps, to check what happened before and what happened later - then you can delete all rows whose activity is either 'Deletion' or 'Deletion Complete' and whose orderid is equal to any order id of any rows that have an activity of 'In storage'.
If, however, 'In storage' needs to happen before 'Deletion' or 'Deletion Complete', then you also need to verify that in the subselect I'm using below.
-- create the example table - drop any table of that name existing before ...
DROP TABLE IF EXISTS input;
CREATE TABLE input(OrderID,Activity) AS (
SELECT 1,'Item Started'
UNION ALL SELECT 1,'Task 1 complete'
UNION ALL SELECT 1,'In storage'
UNION ALL SELECT 1,'Deletion'
UNION ALL SELECT 1,'Deletion Complete'
UNION ALL SELECT 2,'Item Started'
UNION ALL SELECT 2,'Deletion'
UNION ALL SELECT 2,'Deletion Complete'
);
-- here goes
DELETE
FROM input
WHERE orderid IN (
SELECT orderid FROM input WHERE activity='In storage'
)
AND activity IN('Deletion','Deletion Complete')
;