Unfortunately this database has a ton of duplicate email addresses in it. I need to do a query and return only unique emails, doesn't really matter which one.
The query I have looks like this, can't really figure out what to add to not get duplicate emails returned. Can anyone think of anything?
select c.cid, c.email, c.uuid, e.code
from c
inner join e on e.cid = c.cid
where regexp_like(c.email, '\.net$', 'i');
-- Adding some additional info on request
The above query returns the following results, where you can see there are duplicates. I'm interested in only returning one row per unique email address.
3478|[email protected]|ouskns;dhf|1
3488|[email protected]|jlsudo;uff|0
3598|[email protected]|dl;udjffff|1
3798|[email protected]|osuosdujff|1
3888|[email protected]|odsos7jfff|1
-- Solution, thanks Mathguy
select cid, email, uuid, code
from
(select c.cid, c.email, c.uuid, e.code, row_number() over (partition by
c.email order by null) as rn
from c
inner join e on e.cid = c.cid
where regexp_like(c.email, '\.net$', 'i')
)
where rn = 1;
If it works as is and the only problem is the duplicates, you can change c.email
to MAX(c.email) as email
in the select
clause, and add a group by clause to group by the other columns included in select
.
EDIT: (actually I should delete the original answer since the OP clarified his question was quite different from what he seemed to ask originally - but that would also delete the comments... so editing instead)
If your query produces the desired results, but now you must pick just one random row per email address, you can try this:
select cid, email, uuid, code
from
( -- .... copy your select query here
-- ADD one column to the select line like so:
-- select c.cid, c.uuid, c.email, e.code,
-- row_number() over (partition by c.email order by null) as rn
-- ....
)
where rn = 1;