I have the following requirement to remove any duplicate zero value rows, where there exists a matching row with a non zero value. The match case is done on all the other columns.
for example,
values | type | month_year | cli_name | media | ordering
50 | Revenue | Nov_20 | google | agency | sample
0 | Revenue | Nov_20 | google | agency | sample
I want the zero row here removed, because there is a matching row on the type,month_year,cli_name,media,ordering columns, and the values column is not zero.
I have the following cte function in order to do this,
WITH CTE AS(
SELECT a.*,ROW_NUMBER() OVER (PARTITION BY a.type, a.month_year, a.cli_name, a.media, a.ordering
ORDER BY a.type, a.month_year, a.cli_name, a.media, a.ordering)as RN
FROM mytable a
)
DELETE FROM CTE WHERE RN > 1 and [values] = 0
The behaviour of the function is quite strange. When first run, it only removes a portion of the duplicates. I need to run it a couple more times, and each time, it picks up more duplicates and removes them, until I run it (in my scenario, the 4th time) and it doesn't remove anymore. The problem here is that duplicate zero value rows still remain in the table.
I know this is the case because running the below query still returns results after the cte function has stopped finding/removing duplicates:
SELECT DISTINCT b.[values],a.[values], b.[type], b.month_year, b.cli_name, b.media, b.ordering
FROM mytable a
JOIN mytable b on b.[type] = a.[type]
and b.month_year= a.month_year
and b.cli_name= a.cli_name
and b.media = a.media
and b.ordering= a.ordering
where a.[values] <> 0 and b.[values] = 0
So i've 2 questions,
I assume there's a flaw with the cte function but i'm struggling to spot it, appreciate any help!
The ORDER BY allows zero values in the first position. These values will not be deleted (WHERE RN > 1 and [values] = 0). It could take several iterations before the non-zero number reaches the first position
Change ORDER BY to a.value DESC
The above assumes that non-zero values are positive numbers (as shown). IF a non-zero [value] can be positive or negative, you will need to order by the absolute value: ABS(a.value)