Currently, the table is ordered in ascending order by row_number
. I need help removing duplicates based on 2 conditions.
stage
, that is online
then I want to keep that row, doesn't matter which one, there can be multiple.online
for that org_id
, then I keep row_number
= 1
which would be the oldest element.sales_id | org_id | stage | row_number |
---|---|---|---|
ccc_123 | ccc | off-line | 1 |
ccc_123 | ccc | off-line | 2 |
ccc_123 | ccc | online | 3 |
abc_123 | abc | off-line | 1 |
abc_123 | abc | power-off | 2 |
zzz_123 | zzz | power-off | 1 |
so the table should look like this after:
sales_id | org_id | stage |
---|---|---|
ccc_123 | ccc | online |
abc_123 | abc | off-line |
zzz_123 | zzz | power-off |
Looks like this, stackoverflow not working well with second table for some reason
I would use a combination of a CASE
statement to modify the rownumber of records with stage='online' and then use ROW_NUMBER
to allow me to filter for the lowest value in a group.
http://sqlfiddle.com/#!17/1421b/5
create table sales_stage (
sales_id varchar,
org_id varchar,
stage varchar,
row_num int);
insert into sales_stage (sales_id, org_id, stage, row_num) values
('ccc_123', 'ccc', 'off-line', 1),
('ccc_123', 'ccc', 'off-line', 2),
('ccc_123', 'ccc', 'online', 3),
('abc_123', 'abc', 'off-line', 1),
('abc_123', 'abc', 'power-off', 2),
('zzz_123', 'zzz', 'power-off', 1);
SELECT
sales_id, org_id, stage
FROM
(
SELECT
sales_id, org_id, stage,
ROW_NUMBER() OVER(PARTITION BY sales_id, org_id ORDER BY row_num) as rn
FROM (
SELECT sales_id, org_id, stage,
CASE WHEN stage='online' THEN -999 ELSE row_num END as row_num
FROM sales_stage
) x
) y
WHERE rn = 1