Search code examples
sqlgroup-bysalesforce-marketing-cloud

SQL GROUP BY Results - Salesforce Marketing Cloud


I'm looking to find all duplicate records and then select all duplicates minus the oldest record from each set so that I can delete duplicates and keep one unique record.

When I run this query, I get the results I want. Leaves me with a unique email address and the oldest create date.

SELECT 
    EmailAddress,
    MIN(CreatedDate)
FROM [_ListSubscribers]
WHERE EmailAddress IN
    (
        SELECT EmailAddress
        FROM _ListSubscribers
        GROUP BY EmailAddress
        HAVING COUNT(EmailAddress) > 1
    )
GROUP BY EmailAddress

When I add SubscriberKey to the query, the results DOUBLE! Why is that? I just want to see the SubscriberKey tied to the EmailAddress I found that has the oldest date in the subquery.

SELECT 
    EmailAddress,
    SubscriberKey,
    MIN(CreatedDate)
FROM [_ListSubscribers]
WHERE EmailAddress IN
    (
        SELECT EmailAddress
        FROM _ListSubscribers
        GROUP BY EmailAddress
        HAVING COUNT(EmailAddress) > 1
    )
GROUP BY EmailAddress, SubscriberKey

Solution

  • You're getting multiple records because you're grouping by SubscriberKey. You'll need to match by EmailAddress and CreatedDate. Try doing a sub query and joining it back to your original table.

    select 
    [_ListSubscribers].EmailAddress,
    [_ListSubscribers].SubscriberKey,
    [_ListSubscribers].CreatedDate,
    from
    (
    SELECT 
        EmailAddress,
        MIN(CreatedDate) as CreatedDate
        FROM [_ListSubscribers]
        GROUP BY EmailAddress, SubscriberKey
        Having count(EmailAddress)>1
    ) SubTbl
    inner join
    [_ListSubscribers] on
    [_ListSubscribers].EmailAddress = SubTbl.EmailAddress
    and
    [_ListSubscribers].CreatedDate = SubTbl.CreatedDate