Search code examples
oraclejoinuniquedistinct

Oracle select distinct with join and multiple columns


Unfortunately this database has a ton of duplicate email addresses in it. I need to do a query and return only unique emails, doesn't really matter which one.

The query I have looks like this, can't really figure out what to add to not get duplicate emails returned. Can anyone think of anything?

select c.cid, c.email, c.uuid, e.code
from c
    inner join e on e.cid = c.cid
where regexp_like(c.email, '\.net$', 'i');

-- Adding some additional info on request

The above query returns the following results, where you can see there are duplicates. I'm interested in only returning one row per unique email address.

3478|[email protected]|ouskns;dhf|1
3488|[email protected]|jlsudo;uff|0
3598|[email protected]|dl;udjffff|1
3798|[email protected]|osuosdujff|1
3888|[email protected]|odsos7jfff|1

-- Solution, thanks Mathguy

select cid, email, uuid, code
from
  (select c.cid, c.email, c.uuid, e.code, row_number() over (partition by
    c.email order by null) as rn
   from c
    inner join e on e.cid = c.cid
   where regexp_like(c.email, '\.net$', 'i')
  )
where rn = 1;

Solution

  • If it works as is and the only problem is the duplicates, you can change c.email to MAX(c.email) as email in the select clause, and add a group by clause to group by the other columns included in select.

    EDIT: (actually I should delete the original answer since the OP clarified his question was quite different from what he seemed to ask originally - but that would also delete the comments... so editing instead)

    If your query produces the desired results, but now you must pick just one random row per email address, you can try this:

    select cid, email, uuid, code
    from
       (  -- ....   copy your select query here
          --        ADD one column to the select line like so:
          --        select c.cid, c.uuid, c.email, e.code, 
          --                     row_number() over (partition by c.email order by null) as rn
          -- ....
       )
    where rn = 1;