Search code examples
vbaexcelworksheet-function

Finding and counting number of duplicates


I have a spreadsheet with a column called NumberID that has about 50k records. I am aware that there are duplicates however with scrolling up/down it takes forever to find anything plus often times excel is being somewhat slow. I'm trying to write a quick snippet of code to be able to find and count the number of duplicates.

I'm trying to write a quick way of doing it, basically my data is from rows 20 to 48210 and I'm trying to find a number total duplicate records.

Dim lastRow As Long
Dim matchFoundIndex As Long
Dim iCntr As Long
Dim count As Long
count = 0
lastRow = Range("B48210").End(xlUp).Row
For iCntr = 1 To lastRow
    If Cells(iCntr, 1) <> "" Then
       matchFoundIndex = WorksheetFunction.Match(Cells(iCntr, 1), Range("B20:B" & lastRow), 0)
        If iCntr <> matchFoundIndex Then
            count = count + 1
        End If
     End If
Next

MsgBox count

Here im getting an error on = WorkSheetFunction.Match - i found that this property can be used to accomplish what I'm trying to do. The error says

Unable to get the match property for the worksheetfunction class.

Someone have an idea? My vba has been rusty


Solution

  • since you want to "count the number of duplicates", a very fast way of doing that is exploiting RemoveDuplicates() method of Range object, like follows:

    Option Explicit
    
    Sub main()
        Dim helperCol As Range
        Dim count As Long
    
        With Worksheets("IDs") '<--| reference your relevant sheet (change "IDs" to youtr actual sheet name)
            Set helperCol = .UsedRange.Resize(, 1).Offset(, .UsedRange.Columns.count) '<--| set a "helper" range where to store unique identifiers
            With .Range("A1", .Cells(.Rows.count, 1).End(xlUp)) '<-- reference "IDs" column from row 1 (header) to last not empty cell
                helperCol.Value = .Value '<--| copy identifiers to "helper" range
                helperCol.RemoveDuplicates Columns:=1, Header:=xlYes '<--| remove duplicates in copied identifiers
                count = .SpecialCells(xlCellTypeConstants).count - helperCol.SpecialCells(xlCellTypeConstants).count '<--| count duplicates as the difference between original IDs number and unique ones
            End With
            helperCol.ClearContents '<--| clear "helper" range
        End With
        MsgBox count & " duplicates"
    End Sub