I have tried to find some records with a common table expression and in conjunction with a group by to find duplicateNo and subDuplicateNo of a specific table. Its all about finding duplicates from a table with a notification of duplicateNo and subDuplicateNo. But I'm wondering that subDuplicateNo always are 1. But how can I increment the column subDuplicateNo? I have until now searched a lot without any success :-(
My code is as below:
WITH cte AS (
SELECT t.Upper_Fornavn_FirstPart, t.Upper_Efternavn, Upper_Adresse, t.Telefon, Upper_postnr, Count(*) AS occurences,
ROW_NUMBER() OVER (ORDER BY t.Upper_Fornavn_FirstPart, t.Upper_Efternavn, Upper_Adresse, t.Telefon, Upper_postnr) as dubletno, t.subdubletno
FROM (
SELECT V2.Upper_Fornavn_FirstPart, V2.Upper_Efternavn, MA2.Upper_Adresse, Telefon, Upper_postnr,
DENSE_RANK() OVER (PARTITION BY V2.Upper_Fornavn_FirstPart, V2.Upper_Efternavn, MA2.Upper_Adresse, Telefon, Upper_postnr ORDER BY V2.Upper_Fornavn_FirstPart, V2.Upper_Efternavn, MA2.Upper_Adresse, Telefon, Upper_postnr) AS subdubletno
FROM Medlemsdata V2
LEFT OUTER JOIN Postnumre
RIGHT OUTER JOIN MedlemsAdresse MA2 ON Postnumre.Postnummer = MA2.Postnr ON V2.FK_AdrID = MA2.AdrID
WHERE V2.vennenr > 0 AND V2.Upper_Fornavn_FirstPart <> '' AND V2.Upper_Efternavn <> '' AND Upper_Adresse <> '' AND V2.Telefon <> '' AND Upper_postnr <> ''
) as t
GROUP BY t.Upper_Fornavn_FirstPart, Upper_Efternavn, Upper_Adresse, Telefon, Upper_postnr, t.subdubletno
HAVING Count(*) > 1
)
SELECT V1.Vennenr, V1.Upper_Fornavn_FirstPart, V1.Upper_Efternavn, MA1.Upper_Adresse, V1.Telefon, MA1.Upper_postnr, cte.dubletno, cte.subdubletno
FROM
Medlemsdata V1
LEFT OUTER JOIN Postnumre
RIGHT OUTER JOIN MedlemsAdresse MA1 ON Postnumre.Postnummer = MA1.Postnr ON V1.FK_AdrID = MA1.AdrID
INNER JOIN cte ON
cte.Upper_Fornavn_FirstPart = V1.Upper_Fornavn_FirstPart AND
cte.Upper_Efternavn = v1.Upper_Efternavn AND
cte.Upper_Adresse = MA1.Upper_Adresse AND
cte.Upper_postnr = MA1.Upper_postnr AND
cte.Telefon = V1.Telefon
WHERE V1.vennenr > 0 and cte.occurences > 1
and the actual and the expected data are as follows:
Actual result:
ID Upper_Fornavn_FirstPart UPPER_EFTERNAVN UPPER_ADRESSE TELEFON UPPER_POSTNR DubletNo SubDubletNo
===============================================================================================================
1 ANNE JENSEN BIRKEHAVEN 2 12345678 7100 1 1
2 ANNE JENSEN BIRKEHAVEN 2 12345678 7100 1 1
3 TOM HANSEN SLOTSGADE 1 98765432 6000 2 1
4 TOM HANSEN SLOTSGADE 1 98765432 6000 2 1
---------------------------------------------------------------------------------------------------------------
Expected result:
ID Upper_Fornavn_FirstPart UPPER_EFTERNAVN UPPER_ADRESSE TELEFON UPPER_POSTNR DubletNo SubDubletNo
===============================================================================================================
1 ANNE JENSEN BIRKEHAVEN 2 12345678 7100 1 1
2 ANNE JENSEN BIRKEHAVEN 2 12345678 7100 1 2
3 TOM HANSEN SLOTSGADE 1 98765432 6000 2 1
4 TOM HANSEN SLOTSGADE 1 98765432 6000 2 2
Thanks in advance for any kind of help.
I really hope that this technically can be done ...
Regards
Michael
Update: I have tried the following code - but now it gives me 1 in all records of both duplicateNo and subDuplicateNo :-(
WITH cte AS (
SELECT t.Upper_Fornavn_FirstPart, t.Upper_Efternavn, Upper_Adresse, t.Telefon, Upper_postnr, t.dubletno, t.subdubletno
FROM (
SELECT V2.Upper_Fornavn_FirstPart, V2.Upper_Efternavn, MA2.Upper_Adresse, Telefon, Upper_postnr,
COUNT(*) OVER (PARTITION BY V2.Upper_Fornavn_FirstPart, V2.Upper_Efternavn, Upper_Adresse, v2.Telefon, Upper_postnr) as occurences,
DENSE_RANK() OVER (PARTITION BY V2.Upper_Fornavn_FirstPart, V2.Upper_Efternavn, MA2.Upper_Adresse, Telefon, Upper_postnr ORDER BY Upper_postnr) AS dubletno,
ROW_NUMBER() OVER (PARTITION BY V2.Upper_Fornavn_FirstPart, V2.Upper_Efternavn, MA2.Upper_Adresse, Telefon, Upper_postnr ORDER BY Upper_postnr) AS subdubletno
FROM Medlemsdata V2
LEFT OUTER JOIN Postnumre
RIGHT OUTER JOIN MedlemsAdresse MA2 ON Postnumre.Postnummer = MA2.Postnr ON V2.FK_AdrID = MA2.AdrID
WHERE V2.vennenr > 0 AND V2.Upper_Fornavn_FirstPart <> '' AND V2.Upper_Efternavn <> '' AND Upper_Adresse <> '' AND V2.Telefon <> '' AND Upper_postnr <> ''
GROUP BY Upper_Fornavn_FirstPart, Upper_Efternavn, Upper_Adresse, Telefon, Upper_postnr
HAVING Count(*) > 1
) as t
)
SELECT V1.Vennenr, V1.Upper_Fornavn_FirstPart, V1.Upper_Efternavn, MA1.Upper_Adresse, V1.Telefon, MA1.Upper_postnr, cte.dubletno, cte.subdubletno
FROM
Medlemsdata V1
LEFT OUTER JOIN Postnumre
RIGHT OUTER JOIN MedlemsAdresse MA1 ON Postnumre.Postnummer = MA1.Postnr ON V1.FK_AdrID = MA1.AdrID
INNER JOIN cte ON
cte.Upper_Fornavn_FirstPart = V1.Upper_Fornavn_FirstPart AND
cte.Upper_Efternavn = v1.Upper_Efternavn AND
cte.Upper_Adresse = MA1.Upper_Adresse AND
cte.Upper_postnr = MA1.Upper_postnr AND
cte.Telefon = V1.Telefon
WHERE V1.vennenr > 0
This query appears to be doing a lot of extra work. I also suspect your right join construction could be simplified.
SELECT t.*
FROM (
SELECT V2.Upper_Fornavn_FirstPart, V2.Upper_Efternavn, V2.Upper_Adresse, Telefon, Upper_postnr,
COUNT(*) OVER (PARTITION BY V2.Upper_Fornavn_FirstPart, V2.Upper_Efternavn, V2.Upper_Adresse, Telefon, Upper_postnr) as occurences,
DENSE_RANK() OVER (ORDER BY V2.Upper_Fornavn_FirstPart, V2.Upper_Efternavn, V2.Upper_Adresse, Telefon, Upper_postnr) as dubletno,
ROW_NUMBER() OVER (PARTITION BY V2.Upper_Fornavn_FirstPart, V2.Upper_Efternavn, V2.Upper_Adresse, Telefon, Upper_postnr ORDER BY Upper_postnr) AS subdubletno
FROM data V2
WHERE V2.vennenr > 0 AND V2.Upper_Fornavn_FirstPart <> '' AND V2.Upper_Efternavn <> '' AND Upper_Adresse <> '' AND V2.Telefon <> '' AND Upper_postnr <> ''
) as t
WHERE t.occurences > 0;