I am trying to return a row of data if the value in column A which is a unique key for that row does not appear in a different sheet and column.
The keys look like this 5028-10/15/2021-10000000021-80 (the key is created based on the data in the row).
Each of the two data sets will have these keys in each row, and either data set could have 10,000 to 300,000+ rows. Generally speaking, the two data sets should contain the exact same data, but it needs to be confirmed, and the key allows for comparison. Normally, Vlookup is used and works just fine. However, I am trying to automate this process in VBA.
When I run the following code, it runs, but it runs for minutes (on test data with 30,000 rows) and doesn't return the correct result. Further, why is vlookup almost instant when returning the answer (I put it in the 30,000 rows and it has no problem).
I ran the below code on some dummy data where there was one row that did not match. So the code should return that one row. (or at least in the below code tell me the key).
Sub Summarize()
Dim WS1 As Worksheet
Dim WS2 As Worksheet
Dim WS3 As Worksheet
Dim C As Range
Set WS1 = Workbooks("Book1.xlsm").Sheets(1)
Set WS2 = Workbooks("Book2.xlsx").Sheets(1)
Set WS3 = Workbooks("Book3.xlsx").Sheets(1)
For Each Cell In WS2.Range("A2:A30100")
With WS3.Range("A1:A30100")
Set C = .Find(Cell.Value, LookIn:=xlValues)
If Not C Is Nothing Then
Else
Debug.Print (cell.Value)
End If
End With
Next
End Sub
When running the code, it indicated that many of the keys could not be found when searching the data set, but as noted above, there is only one that does not trace.
I don't know why it's saying some keys can't be found, and why it is taking so long to search through the list.
As a note, I did get .find() to work while at the office (this is a reproduction of the code) but it was still very slow, and I feel that there has to be a faster way.
Performing operations (including search) on cell takes time. Executing the search many times (30099 * 30099) in your code is time-consuming.
It would be more efficient to validate data using an array in memory.
It only takes 0.13s to validate 40K rows based on my testing.
Option Explicit
Sub Demo()
Dim objDic As Object
Dim i As Long, sKey As String
Dim arrData3, arrData2, lastRow As Long
Dim WS2 As Worksheet, WS3 As Worksheet
' Set WS2 = Workbooks("Book2.xlsx").Sheets(1)
' Set WS3 = Workbooks("Book3.xlsx").Sheets(1)
Set WS2 = ActiveWorkbook.Sheets(1) ' for testing
Set WS3 = ActiveWorkbook.Sheets(2)
Set objDic = CreateObject("scripting.dictionary")
lastRow = WS3.Cells(WS3.Rows.Count, 1).End(xlUp).Row
arrData3 = WS3.Range("A2:A" & lastRow).Value
For i = LBound(arrData3) To UBound(arrData3)
sKey = arrData3(i, 1)
If Not objDic.exists(sKey) Then
objDic(sKey) = i + 1
End If
Next i
arrData2 = WS2.Range("A2:A" & lastRow).Value
For i = LBound(arrData2) To UBound(arrData2)
sKey = arrData2(i, 1)
If Not objDic.exists(sKey) Then
Debug.Print "Row : " & i + 1 & ", Value: " & sKey
End If
Next i
End Sub