Search code examples
excelvba

Find all values in one column that do not trace to another column


I am trying to return a row of data if the value in column A which is a unique key for that row does not appear in a different sheet and column.

The keys look like this 5028-10/15/2021-10000000021-80 (the key is created based on the data in the row).

Each of the two data sets will have these keys in each row, and either data set could have 10,000 to 300,000+ rows. Generally speaking, the two data sets should contain the exact same data, but it needs to be confirmed, and the key allows for comparison. Normally, Vlookup is used and works just fine. However, I am trying to automate this process in VBA.

When I run the following code, it runs, but it runs for minutes (on test data with 30,000 rows) and doesn't return the correct result. Further, why is vlookup almost instant when returning the answer (I put it in the 30,000 rows and it has no problem).

I ran the below code on some dummy data where there was one row that did not match. So the code should return that one row. (or at least in the below code tell me the key).

Sub Summarize()
Dim WS1 As Worksheet
Dim WS2 As Worksheet
Dim WS3 As Worksheet

Dim C As Range

Set WS1 = Workbooks("Book1.xlsm").Sheets(1)
Set WS2 = Workbooks("Book2.xlsx").Sheets(1)
Set WS3 = Workbooks("Book3.xlsx").Sheets(1)

For Each Cell In WS2.Range("A2:A30100")
    With WS3.Range("A1:A30100")
        Set C = .Find(Cell.Value, LookIn:=xlValues)
        If Not C Is Nothing Then
            
        Else
            Debug.Print (cell.Value)
        End If
    End With
Next

End Sub

When running the code, it indicated that many of the keys could not be found when searching the data set, but as noted above, there is only one that does not trace.

I don't know why it's saying some keys can't be found, and why it is taking so long to search through the list.

As a note, I did get .find() to work while at the office (this is a reproduction of the code) but it was still very slow, and I feel that there has to be a faster way.


Solution

    • Performing operations (including search) on cell takes time. Executing the search many times (30099 * 30099) in your code is time-consuming.

    • It would be more efficient to validate data using an array in memory.

    • It only takes 0.13s to validate 40K rows based on my testing.

    Option Explicit
    Sub Demo()
        Dim objDic As Object
        Dim i As Long, sKey As String
        Dim arrData3, arrData2, lastRow As Long
        Dim WS2 As Worksheet, WS3 As Worksheet
        ' Set WS2 = Workbooks("Book2.xlsx").Sheets(1)
        ' Set WS3 = Workbooks("Book3.xlsx").Sheets(1)
        Set WS2 = ActiveWorkbook.Sheets(1) ' for testing
        Set WS3 = ActiveWorkbook.Sheets(2)
        Set objDic = CreateObject("scripting.dictionary")
        lastRow = WS3.Cells(WS3.Rows.Count, 1).End(xlUp).Row
        arrData3 = WS3.Range("A2:A" & lastRow).Value
        For i = LBound(arrData3) To UBound(arrData3)
            sKey = arrData3(i, 1)
            If Not objDic.exists(sKey) Then
                objDic(sKey) = i + 1
            End If
        Next i
        arrData2 = WS2.Range("A2:A" & lastRow).Value
        For i = LBound(arrData2) To UBound(arrData2)
            sKey = arrData2(i, 1)
            If Not objDic.exists(sKey) Then
                Debug.Print "Row : " & i + 1 & ", Value: " & sKey
            End If
        Next i
    End Sub