Search code examples
excelvbaoptimizationmatchcomparison

Optimize VBA Script to Combine and Consolidate


I am working on optimizing this script since I am working with two large (~1M rows) worksheets, each and think my code is inefficient and takes way too long to run and wondering if I can redo it to make it faster.

These are the steps:

  • Combine Excel Sheet 1 and Sheet 2 using Column A as common identifier
  • Add a column to identify if Columns E = H (True or False)
  • Remove all True's (this should get rid of most rows, leaving a few hundred)

Also, what does this line exactly mean? in particular the Columns (1), A, :M and G - want to confirm its picking the right matches

iRow = Application.Match(ID, ws2.UsedRange.Columns(1), 0)
If Not IsError(iRow) Then ws2.Range("A" & iRow & ":M" & iRow).Copy ws3.Range("G" & r.Row)

Sheet 1:

Sheet 2:

Final Expected Result:

Sub TestGridUpdate()
  Dim ws1 As Worksheet, ws2 As Worksheet, ws3 As Worksheet
  Dim TestGridFound As Boolean, r As Range
  Set ws1 = ThisWorkbook.Worksheets("Sheet1")
  Set ws2 = ThisWorkbook.Worksheets("Sheet2")

  TestGridFound = False 'Look for TestGrid worksheet
  For Each ws In Worksheets
    If ws.Name = "Combined" Then TestGridFound = True
  Next

  If TestGridFound Then 'If Combined is found then use it else create it
    Set ws3 = ThisWorkbook.Worksheets("Combined")
    ws3.Cells.Clear
  Else
    Set ws3 = ThisWorkbook.Worksheets.Add(After:=ThisWorkbook.Worksheets(ThisWorkbook.Worksheets.Count))
    ws3.Name = "Combined"
  End If

  ws3.Range(ws1.UsedRange.Address).Value = ws1.UsedRange.Value 'Copy ws1 to ws3 (TestGrid)

  For Each r In ws3.UsedRange.Rows ' Add ws2 details to ws3 (TestGrid)
    ID = r.Cells(, 1).Value
    iRow = Application.Match(ID, ws2.UsedRange.Columns(1), 0)
    If Not IsError(iRow) Then ws2.Range("A" & iRow & ":M" & iRow).Copy ws3.Range("G" & r.Row)
  Next
End Sub

Sub FillFormula() 'Add a column to identify column matches
  'Set reference to the sheet in the workbook.
  Set ws = ThisWorkbook.Worksheets("Combined")
  ws.Activate 'not required but allows user to view sheet if warning message appears
  Range("N2").Formula = "=$E2=H2"
  Range("N2", "N" & Cells(Rows.Count, 1).End(xlUp).Row).FillDown
End Sub

Sub Delete_Rows_Based_On_Value() 'Delete all matches that are true'
  'Apply a filter to a Range and delete visible rows
  Dim ws As Worksheet
  Set ws = ThisWorkbook.Worksheets("Combined") 'Set reference to sheet in workbook.
  ws.Activate 'not required but allows user to view sheet if warning message appears

  On Error Resume Next 'Clear any existing filters
    ws.ShowAllData
  On Error GoTo 0

  ws.Range("A:P").AutoFilter Field:=14, Criteria1:="TRUE" '1. Apply Filter

  Application.DisplayAlerts = False '2. Delete Rows
    Sheets("Combined").AutoFilter.Range.Offset(1).Delete xlShiftUp
  Application.DisplayAlerts = True

  On Error Resume Next '3. Clear Filter
    ws.ShowAllData
  On Error GoTo 0
End Sub

Solution

  • answer to your question, the part:

      For Each r In ws3.UsedRange.Rows ' Add ws2 details to ws3 (TestGrid)
        ID = r.Cells(, 1).Value
        iRow = Application.Match(ID, ws2.UsedRange.Columns(1), 0)
        If Not IsError(iRow) Then ws2.Range("A" & iRow & ":M" & iRow).Copy ws3.Range("G" & r.Row)
      Next
    

    in a short way: it compares ws2 and w3 Column 1 - Column "A" values, if match is found, cell value from ws2 is copied to ws3. Application.Match(ID, ws2.UsedRange.Columns(1), 0) will cause error if cell values do not match, accordingly, that's why the following line looks as below: If Not IsError(iRow) Then ws2.Range("A" & iRow & ":M" & iRow).Copy ws3.Range("G" & r.Row) (perform with copying if no error).

    To be honest, I suspect this code runs above a minute or even two if you deal with around 1mil rows. Read and learn how to use arrays and how to assign ranges to arrays and how loop through them. Code will run MUCH faster as it will not be iterating through each actual cell on the excel, - all will be done in RAM memory (like in the virtual data table). No read/write (copy/paste) will be performed during the Array loop and at the end result will be written out in one step.

    Quick tip, while creating large arrays, use .value2 it will also improve performance. my_Arr1 = range("my_range").Value2 Once you will understand simple arrays, get your brains to wrap around 2d arrays, as all ranges loaded to an array will end up 2d.

    Examples to start from:

    http://www.cpearson.com/excel/vbaarrays.htm

    https://stackoverflow.com/a/23701283/8805842

    How to avoid using Select in Excel VBA

    https://stackoverflow.com/a/46954174/8805842

    https://stackoverflow.com/a/30067221/8805842

    For later read:

    https://stackoverflow.com/a/51524230/8805842

    https://stackoverflow.com/a/51608764/8805842