Here is my question:
I have the following data that I am looking to de-dupe and generally clean, the fields are:
RDBMS is Azure SQL
What I'm wanting to do is to do a Group-By on Name, Phone Number and Email and then record the UniqueID in a separate column e.g.:
NewUniqueID Name PhoneNumber Email OldIDs
1 Sam Smith 091234567 [email protected] 12,54,65,70
However, I'm not sure what the correct terminology is to create the OldIDs field, so I've been unable to find out how to do it.
You seem to want string aggregation:
select
row_number() over(order by (select null)) newUniqueID,
name,
phoneNumber,
email,
string_agg(uniqueID, ',') oldIDs
from mytable
group by name, phoneNumber, email
This generates one row per (name, phoneNumber, email)
tuple, and concatenate all uniqueID
s in column oldIDs
. row_number()
assigns a new id
to each row in the resultset.