Search code examples
excelvbauniqueuniqueidentifierdistinct-values

Unique Count Formula for large dataset


I am having trouble determining a way to enter a 1 or 0 into an adjacent cell to indicate whether or not a value is unique when working with a large dataset. I have read of multiple methods for accomplishing this, however none of them seem efficient for my purposes: I am using an instance of Excel 2010 (so I do not have the Distinct Count feature in PivotTables, and when I try to use PowerPivot it crashes my file due to processing limitations.

In this StackOverflow question: Simple Pivot Table to Count Unique Values there are suggestions to use SUMPRODUCT or COUNTIF, but when working with 50,000+ rows as I am, this causes terrible performance and a file size of ~35 MB instead of ~3 MB. I wanted to know if there is a better solution for a large, dynamic dataset whether it is a formula or VBA.

An example of what I would like to accomplish is (with the Unique column being the adjacent cell):

Name   Week   Unique
John   1      1
Sally  1      1
John   1      0
Sally  2      1

I attempted to script the same functionality of COUNTIF but with no success:

For Each Cell In ThisWorkbook.Worksheets("Overtime & Type Data").Range("Z2:Z" & DataLastRow)
If Worksheets("Overtime & Type Data").Cells(Cell.Row, 26) <> Worksheets("Overtime & Type Data").Cells(Cell.Row - 1, 26) Then
FirstCell = Cell.Row
End If
If (Worksheets("Overtime & Type Data").Range(Cells(FirstCell, 26), Cells(Cell.Row, 26)) = Worksheets("Overtime & Type Data").Range(Cells(Cell.Row, 26))) = True Then
    Cell.Value = 1
Else
    Cell.Value = 0
End If
Next Cell

Solution

  • This code ran on over 130,000 rows successfully in less than 3 seconds. Adjust the column letters to fit your dataset.

    Sub tgr()
    
        Const colName As String = "A"
        Const colWeek As String = "B"
        Const colOutput As String = "C"
    
        Dim ws As Worksheet
        Dim rngData As Range
        Dim DataCell As Range
        Dim rngFound As Range
        Dim collUniques As Collection
        Dim arrResults() As Long
        Dim ResultIndex As Long
        Dim UnqCount As Long
    
        Set ws = ThisWorkbook.Sheets("Overtime & Type Data")
        Set rngData = ws.Range(colName & 2, ws.Cells(Rows.Count, colName).End(xlUp))
        Set collUniques = New Collection
        ReDim arrResults(1 To rngData.Cells.Count, 1 To 1)
    
        On Error Resume Next
        For Each DataCell In rngData.Cells
            ResultIndex = ResultIndex + 1
            collUniques.Add ws.Cells(DataCell.Row, colName).Value & ws.Cells(DataCell.Row, colWeek).Value, ws.Cells(DataCell.Row, colName).Value & ws.Cells(DataCell.Row, colWeek).Value
            If collUniques.Count > UnqCount Then
                UnqCount = collUniques.Count
                arrResults(ResultIndex, 1) = 1
            Else
                arrResults(ResultIndex, 1) = 0
            End If
        Next DataCell
        On Error GoTo 0
    
        ws.Cells(rngData.Row, colOutput).Resize(rngData.Cells.Count).Value = arrResults
    
    End Sub