Search code examples
vbaexcelfile-processing

Keeping Track of Excel Files Which Have Already been Processed VBA


I have thousands of excel files in sub folders which I need to pull data from, and compile it all in a master sheet. The files typically come in day by day over several months.

I have already built a macro which accomplishes all this.

What I am having trouble with is re-running the macro periodically to include new files.

I need a method of keeping track of which files have already been processed, and which have not. So far my solution has been to use the filename as a unique key and compare each file the macro runs on to a list of unique keys saved on a second worksheet. As you can imagine, this is incredibly slow and I'm wondering if there is a better solution.

Below is the macro I built:

Option Explicit

    Sub PullInspectionData()
        Dim fso, oFolder, oSubfolder, oFile, queue As Collection
        Dim n, i As Long
        Dim wb, mwb As Workbook
        Dim wsb, mws, cws As Worksheet
        Dim DefectCode, Level, LocationComments As String
        Dim PhaseName As String

        'Optimize Macro Speed
          Application.ScreenUpdating = False
          Application.EnableEvents = False
          Application.Calculation = xlCalculationManual
          Application.AskToUpdateLinks = False
          Application.DisplayAlerts = False

        'Creates the collection of files within the subfolder
        Set fso = CreateObject("Scripting.FileSystemObject")
        Set queue = New Collection
        Set mwb = ActiveWorkbook
        Set mws = mwb.Worksheets("Inspection Data")
        Set cws = mwb.Worksheets("Macro Controls")
        RowNumber = Worksheets("Inspection Data").Range("A:A").Cells.SpecialCells(xlCellTypeConstants).Count + 1
        queue.Add fso.GetFolder("filepath") 'obviously replace
        DoEvents

        Do While queue.Count > 0
            Set oFolder = queue(1)
            queue.Remove 1 'dequeue
            DoEvents


            '...insert any folder processing code here...


            For Each oSubfolder In oFolder.subfolders
                queue.Add oSubfolder 'enqueue
                DoEvents
            Next oSubfolder
            DoEvents
            For Each oFile In oFolder.Files
            On Error Resume Next
            DoEvents


' Operate on each file
'This keeps track of which files have already been processed
                n = Worksheets("MacroControls").Range("A:A").Cells.SpecialCells(xlCellTypeConstants).Count
                For i = 1 To n
                If Cells(i, 1).Value = oFile.Name Then
                GoTo SkipLoop
                DoEvents
                End If
                Next i
                DoEvents

                'Actually begins to Copy Information to the Master File
                Cells(i, 1).Value = oFile.Name
                Set wb = Workbooks.Open(oFile)
                Set wsb = wb.Worksheets("PO Data")
                DoEvents

'file processing code removed for simplicity    

               DoEvents
               wb.Close SaveChanges:=False
               DoEvents
    SkipLoop:
            Next oFile
            DoEvents
        Loop

    'Reset Macro Optimization Settings
        Application.EnableEvents = True
        Application.ScreenUpdating = True
        Application.Calculation = xlCalculationAutomatic
        Application.DisplayAlerts = True
        Application.AskToUpdateLinks = True

    End Sub

The "Inspection Data" worksheet is the master file where all the information is compiled, and the "Macro Controls" worksheet contains the list of files that have already been operated on.

How can I speed things up?


Solution

  • Your process is so slow because you loop through each filename in your control sheet each time. That's insane.

    Try something like this:

    For Each oFile In oFolder.Files
    
        If cws.Range("A:A").Find(oFile.Name, lookat:=xlWhole) is Nothing Then
    
             'process this file
    
        End If
    
    Next
    

    Another advantage of this approach is that it will eliminate your GoTo statements, which should only be used in case of real emergency.