Search code examples
sql-serveruniquedistinct-values

SQL Select distinct from multiple fields returning only one row


I have a table with the following columns in SQL Server:

MEMBERID, MEMBEREMAIL, FATHEREMAIL, MOTHEREMAIL, MEMBERNAME

MEMBERID is PK. The three email columns are not unique, so the same email may appear several times in the same row AND in several rows.

I am trying to extract a unique list of emails, and for each email also get a memberid and membername (it does not matter from which record).

For example if I have three rows:

1   [email protected]   [email protected]   [email protected]   Mark
2   [email protected]   [email protected]   [email protected]   John
3   [email protected]   [email protected]   [email protected]   Susan

I want to get the three emails ([email protected], [email protected], [email protected]) and for each of those a MEMBERID in which they appear. It does NOT which MEMBERID (for example for [email protected] I don't care if I get the values 1 and Mark or 2 and John or 3 and Susan, as long as [email protected] appears only once in the results.

If I use DISTINCT when trying to return the email and memberid and membername, of course I get all of the rows.


Solution

  • You could use a subquery to normalize all emails. Then you can use row_number to filter out one memberid, membername per email:

    select  *
    from    (
            select  row_number() over (partition by email order by memberid) as rn
            ,       *
            from    (
                    select  MEMBERID
                    ,       MEMBERNAME
                    ,       MEMBEREMAIL as email
                    from    YourTable
                    union all
                    select  MEMBERID
                    ,       MEMBERNAME
                    ,       FATHEREMAIL
                    from    YourTable
                    union all
                    select  MEMBERID
                    ,       MEMBERNAME
                    ,       MOTHEREMAIL
                    from    YourTable
                    ) as emails
            ) num_emails
    where   rn = 1
    

    You could also normalize the emails using the UNPIVOT clause, like this:

    select  *
    from    (
            select  row_number() over (partition by email order by memberid) as rn
            ,       *
            from    (
                    select  MEMBERID
                    ,       MEMBERNAME
                    ,       email
                    from    YourTable
                    unpivot (
                                    email
                            for     emailOwner
                            in      (
                                    MEMBEREMAIL,
                                    FATHEREMAIL,
                                    MOTHEREMAIL
                                    )
                            ) as u
                    ) as emails
            ) num_emails
    where   rn = 1
    

    Try both versions at SQL Fiddle: