
VBA Function to test if two binary (PDF) files are identical?


I am looking for a simple, performant function for a VBA application that returns True or False depending on whether two files are identical (apart from their names).

If sample.pdf is copied to sample_CopyOf.pdf, then

CompareFiles("sample.pdf","sample_CopyOf.pdf") = True

Otherwise, False.
I don't care what the differences are, just that the files aren't copies of each other.

The command-line equivalent would be fc /c sample.pdf sample_CopyOf.pdf returning the line

FC: no differences encountered

It feels like there should be a Win32 API function that I could use, but none jumped out. And it seems like reading each file and comparing (binary) strings would be slow.


Solution

  • I am fairly sure that there is no faster way to compare files than the command the OS provides. You can use fc for that; if you only want to know whether the files are identical, the trick is to make the command stop as early as possible. This can be done with the switch /Lb<nnn>, setting n=1 (stop after one line of difference). It is important not to use the binary option (/B); otherwise the /Lb parameter is ignored (there are no "lines" in a binary file).

    An alternative would be to use the comp command. It returns immediately when the files have different sizes and stops once 10 bytes of difference are found. However, it then asks whether you want to compare more files and waits for Y or N input. In cmd files, the trick is to pipe an echo N into it: echo N | comp file1 file2, but it is tricky (or impossible? I couldn't figure it out) to issue a command that includes pipes from VBA.
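
    One approach that may get around the pipe problem is to hand the whole pipeline to the command interpreter with cmd /c, so that cmd.exe (not WScript.Shell) sets up the pipe. The following is an untested sketch, not a verified solution: the function name CompareFilesViaComp is made up for illustration, and it assumes that comp exits with 0 for identical files and that cmd /c passes that exit code back to the caller.

    ```vba
    Function CompareFilesViaComp(f1Name As String, f2Name As String) As Boolean
        ' Hypothetical helper: returns True if comp reports the files as identical.
        Const waitOnReturn As Boolean = True
        Dim shell As Object
        Set shell = VBA.CreateObject("WScript.Shell")

        ' Let cmd.exe build the pipe; "echo N" answers comp's
        ' "Compare more files (Y/N)?" prompt.
        Dim cmd As String
        cmd = "cmd /c echo N| comp """ & f1Name & """ """ & f2Name & """"

        ' Assumption: the exit code of the pipeline is the exit code of comp
        ' (the right-most command), and 0 means "files are identical".
        Dim exitCode As Long
        exitCode = shell.Run(cmd, vbHide, waitOnReturn)
        CompareFilesViaComp = (exitCode = 0)
    End Function
    ```

    Because the command string starts with echo rather than with a quote character, cmd /c should leave the quotes around the two file names untouched.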

    I wrapped a small function around the fc command, using the Run method of WScript.Shell. Note the third parameter of the method: it must be set to True so that VBA waits until the command has finished, and the return value therefore reflects the errorLevel returned by fc.

    Function CompareFiles(f1Name As String, f2Name As String) As Boolean
        ' Returns True if files are identical
        Const waitOnReturn As Boolean = True
        Static shell As Object
        If shell Is Nothing Then Set shell = VBA.CreateObject("WScript.Shell")
    
        Dim cmd As String
        cmd = "fc /Lb1 " & """" & f1Name & """ """ & f2Name & """"
        
        ' Long rather than Integer, so an unusually large exit code cannot overflow
        Dim errorLevel As Long
        errorLevel = shell.Run(cmd, vbHide, waitOnReturn)
        ' errorLevel:
        '     0 if files are identical
        '     1 if files are different
        '     2 if one of the files was missing
        CompareFiles = (errorLevel = 0)
    End Function
    

    Running this on files that differ returns instantly, regardless of their size.
    Comparing two identical 50 MB files takes less than 1 s.
    Comparing two identical files of nearly 1 GB took approx. 10 s on my computer.
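
    To reproduce timings like these on your own machine, a small harness along the following lines could be used. It is only a sketch: TimeCompareFiles is a made-up name, it assumes the CompareFiles function above lives in the same module, and the paths in the usage line are placeholders.

    ```vba
    Function TimeCompareFiles(f1Name As String, f2Name As String) As Single
        ' Hypothetical helper: returns the elapsed time of one CompareFiles call
        ' in seconds and prints the result to the Immediate window.
        Dim startTime As Single
        startTime = Timer   ' seconds elapsed since midnight

        Dim identical As Boolean
        identical = CompareFiles(f1Name, f2Name)

        ' Note: the difference is meaningless if the run spans midnight.
        TimeCompareFiles = Timer - startTime
        Debug.Print "Identical: " & identical & ", elapsed: " & _
                    Format(TimeCompareFiles, "0.00") & " s"
    End Function
    ```

    For example, typing ? TimeCompareFiles("C:\Temp\sample.pdf", "C:\Temp\sample_CopyOf.pdf") in the Immediate window runs one comparison. The measured time includes the cost of starting the fc process, so very small files will not go below that overhead.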