I would like to follow up with a question from my earlier POST: Delete or change records in ETL
The problem mentioned there was solved with the following:
; with todelete as (
select *,
count(*) over (partition by label) as cnt,
lag(cost) over (partition by label order by time ASC) as lastcost
ROW_NUMBER() over (partition by label order by time ASC) as r_number
from Table1
)
delete from todelete
where cnt > 1 and r_number between 1 and (cnt/2)*2 and cost=ISNULL(lastcost,cost)
, during testing, however, I came across one problem when these movements occur in the table (they cannot be prevented).:
label cost time
x2 29 14/5/2020 01:00:00
x3 20 14/5/2020 01:02:00
x2 29 15/5/2020 03:12:02
That for the same 'Label' I have two identical lines (except timestamp) with the same 'cost'. And just the above solution will delete these two records. Which I need only the older one to be deleted.
Thx for the advice
UPDATE:
MY goal is.
i have table with records:
label cost time
x2 29 14/5/2020 01:00:00
x3 20 14/5/2020 01:02:00
x2 30 15/5/2020 03:12:02
Now i have delete function:
; with todelete as (
select *,
count(*) over (partition by label) as cnt,
lag(cost) over (partition by label order by time ASC) as lastcost
ROW_NUMBER() over (partition by label order by time ASC) as r_number
from Table1
)
delete from todelete
where cnt > 1 and r_number between 1 and (cnt/2)*2 and cost=ISNULL(lastcost,cost)
With which I get mine they wanted a table:
label cost time
x3 20 14/5/2020 01:02:00
x2 30 15/5/2020 03:12:02
But the problem occurs when the original table looks like:
label cost time
x2 29 14/5/2020 01:00:00
x3 20 14/5/2020 01:02:00
x2 29 15/5/2020 03:12:02
Now Delete function(mentioned above)
I'll get a table:
label cost time
x3 20 14/5/2020 01:02:00
With the above delete function, both records for the label "X2" will be deleted, but I only want to delete the older one.
Nobody anything?
I try this: I couldn't solve it. Here in this you see that it will delete both records for me (I just want the older one): https://rextester.com/TLLQ93275
In this case, it works correctly, but if "x2" has the same price (for example, 29), it will also delete both entries. https://rextester.com/RHB70490
UPDATE:
I finally managed to solve the problem. I added another ranking function and conditioned it appropriately.
; with todelete as (
select *,
count(*) over (partition by label) as cnt,
lag(cost) over (partition by label order by time ASC) as lastcost
,ROW_NUMBER() over (partition by label order by time ASC) as r_number
,ROW_NUMBER() over (partition by label order by time DESC) as r_number2
,RANK() over (partition by cost order by time asc) as TEST
,Row_NUMBER() over (partition by label order by TIME DESC) as TEST2
from Table1
)
DELETE from todelete
where (cnt > 1 and r_number between 1 and (cnt/2)*2 and cost=ISNULL(lastcost,cost) AND TEST2 !=1) OR (cnt>1 AND TEST2<>1 AND r_number2 != 1)
For illustration, here: https://rextester.com/DONME54328