File comparison in VB.Net


I need to know if two files are identical. At first I compared file sizes and creation timestamps, but that's not reliable enough. I have come up with the following code, which seems to work, but I'm hoping that someone has a better, easier, or faster way of doing it.

Basically, I read each file's contents into a byte array and compare their MD5 hashes, computed via System.Security.Cryptography.

Before that I do some simple checks, since there is no reason to read the files if both paths are identical or if either file does not exist.

Imports System.IO
Imports System.Security.Cryptography

Public Function CompareFiles(ByVal file1FullPath As String, ByVal file2FullPath As String) As Boolean

    If Not File.Exists(file1FullPath) OrElse Not File.Exists(file2FullPath) Then
        ' One or both of the files do not exist.
        Return False
    End If

    If String.Compare(file1FullPath, file2FullPath, True) = 0 Then
        ' file1FullPath and file2FullPath point to the same file (case-insensitive match).
        Return True
    End If

    Dim MD5Crypto As New MD5CryptoServiceProvider()
    Dim textEncoding As New System.Text.ASCIIEncoding()

    Dim fileBytes1() As Byte, fileBytes2() As Byte
    Dim fileContents1, fileContents2 As String
    Dim streamReader As StreamReader = Nothing
    Dim fileStream As FileStream = Nothing
    Dim isIdentical As Boolean = False

    Try

        ' Read file 1 to byte array.
        fileStream = New FileStream(file1FullPath, FileMode.Open)
        streamReader = New StreamReader(fileStream)
        fileBytes1 = textEncoding.GetBytes(streamReader.ReadToEnd)
        fileContents1 = textEncoding.GetString(MD5Crypto.ComputeHash(fileBytes1))
        streamReader.Close()
        fileStream.Close()

        ' Read file 2 to byte array.
        fileStream = New FileStream(file2FullPath, FileMode.Open)
        streamReader = New StreamReader(fileStream)
        fileBytes2 = textEncoding.GetBytes(streamReader.ReadToEnd)
        fileContents2 = textEncoding.GetString(MD5Crypto.ComputeHash(fileBytes2))
        streamReader.Close()
        fileStream.Close()

        ' Compare the two hash strings.
        isIdentical = fileContents1 = fileContents2

    Catch ex As Exception

        isIdentical = False

    Finally

        If Not streamReader Is Nothing Then streamReader.Close()
        If Not fileStream Is Nothing Then fileStream.Close()
        fileBytes1 = Nothing
        fileBytes2 = Nothing

    End Try

    Return isIdentical
End Function

Solution

  • I would say hashing the files is the way to go. It's how I have done it in the past.

    Use Using statements when working with streams and other disposable objects, so they are closed and disposed automatically. Here is an example.

    Public Function CompareFiles(ByVal file1FullPath As String, ByVal file2FullPath As String) As Boolean

        If Not IO.File.Exists(file1FullPath) OrElse Not IO.File.Exists(file2FullPath) Then
            ' One or both of the files do not exist.
            Return False
        End If

        If file1FullPath = file2FullPath Then
            ' file1FullPath and file2FullPath point to the same file.
            Return True
        End If

        Try
            ' Identical content always produces identical hashes.
            Dim file1Hash As String = hashFile(file1FullPath)
            Dim file2Hash As String = hashFile(file2FullPath)

            Return file1Hash = file2Hash

        Catch ex As Exception
            ' A file that cannot be read is treated as not identical.
            Return False
        End Try
    End Function
    
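    Private Function hashFile(ByVal filepath As String) As String
        ' Hash the stream directly so the file is never decoded as text.
        Using reader As New System.IO.FileStream(filepath, IO.FileMode.Open, IO.FileAccess.Read)
            Using md5 As New System.Security.Cryptography.MD5CryptoServiceProvider
                Dim hash() As Byte = md5.ComputeHash(reader)
                ' Base64 is a lossless text form of the hash. Decoding the raw bytes
                ' as Unicode can replace invalid sequences, so distinct hashes might
                ' end up as equal strings.
                Return System.Convert.ToBase64String(hash)
            End Using
        End Using
    End Function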
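
    One possible refinement, offered as a sketch rather than as part of the code above: files of different lengths can never be identical, so a cheap length check can skip the hashing entirely. HaveSameLength is a hypothetical helper name, and the module wrapper is only there to make the snippet self-contained.

    ```vbnet
    Imports System.IO

    Module FileCompareSketch
        ' Hypothetical helper: True when both files exist and have the same length in bytes.
        Public Function HaveSameLength(ByVal file1FullPath As String, ByVal file2FullPath As String) As Boolean
            Dim info1 As New FileInfo(file1FullPath)
            Dim info2 As New FileInfo(file2FullPath)

            ' FileInfo.Length throws for a missing file, so check existence first.
            If Not info1.Exists OrElse Not info2.Exists Then Return False

            ' Equal length does not prove equal content; it only rules out a mismatch early.
            Return info1.Length = info2.Length
        End Function
    End Module
    ```

    Assuming this helper, CompareFiles could return False as soon as HaveSameLength returns False, and only call hashFile when the lengths match.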