Search code examples
compilationcomwindows-installerinstallshielddependency-management

Scan for COM references to help generate Reg-Free COM manifests


I am re-implementing a large installation project as an MSI installer in InstallShield 2015 (it was formerly an InstallScript installer from a much older version of InstallShield). One of the many improvements I would like to make is to allow all COM references to be resolved in a registration free manner by using manifest files and following registration free COM activation patterns. I think this isn't possible in all cases because we have some MMC snap-ins, which, as far as I know, can't use reg-free COM, but I would like to get as close as possible.

The challenge is figuring out all the manifests I need to create. We deliver thousands of files. I'm looking for a utility that can help scan the files to determine:

  1. Which files represent COM servers, and what CLSIDs do they contain
  2. Which files represent COM clients, and what CLSIDs do they reference

Hopefully with this information I can then use InstallShield's Reg-Free COM wizard to generate the necessary manifest files.

Is there anything already out there to help with this? If there isn't already a utility for this I'm going to try writing my own by:

  1. Looking at some known CLSIDs in COM servers and trying to identify some signature bytes around them that might help me determine how I can identify CLSIDs within COM servers in general.
  2. Write code to look for this signature in all our binary files and pick up all the CLSIDs
  3. Search all our binary files for the known CLSIDs and assume those that come from outside the servers represent clients.

If it turns out we are using ProgId's and not just CLSIDs, I may have to adjust my strategy, but I am hoping this would cover a large portion of our COM references.

If there's already any utility that might help with this, I'd like to use it rather than write my own. So I'm looking for any tips on writing this myself or tips on finding where this is already done.


Solution

  • With the help of a deleted answer to How to read TLB (type libraries) of unmanaged code from C#?, which led me to the C# source code of a type library reading class at http://clrinterop.codeplex.com/releases/view/17579, I was able to read type libraries to pick up CLSID values (skipping files if the attempt to load a type library failed - just tried to load type a library from every executable file). Then I did a simple search through all the executable files to look for those CLSID values. I'm not sure how complete the results are, but they look somewhat valid at least.

    The GetDirCLSIDRefs function is the top-level function that returns a list of executable filenames and, for each one, a list of associated files whose type libraries contain CLSIDs referenced by the file.

    Private Function GetDirCLSIDRefs(Directory As String, CLSIDs As KeyValuePair(Of Guid, String)()) As KeyValuePair(Of String, String())()
       Dim result As New List(Of KeyValuePair(Of String, String()))
       Dim entries = New System.IO.DirectoryInfo(Directory).GetFileSystemInfos()
       For Each entry In entries
          If TypeOf entry Is System.IO.FileInfo AndAlso _
             (entry.Name.EndsWith(".exe") OrElse _
              entry.Name.EndsWith(".dll") OrElse _
              entry.Name.EndsWith(".ocx")) Then
             Dim refs = ScanForCLSIDs(entry.FullName, CLSIDs)
             If refs IsNot Nothing AndAlso refs.Length > 0 Then result.Add(New KeyValuePair(Of String, String())(entry.FullName, refs))
          ElseIf TypeOf entry Is System.IO.DirectoryInfo Then
             result.AddRange(GetDirCLSIDRefs(entry.FullName, CLSIDs))
          End If
       Next
       Return result.ToArray()
    End Function
    
    Private Function GetDirCLSIDs(Directory As String) As KeyValuePair(Of Guid, String)()
       Dim result As New List(Of KeyValuePair(Of Guid, String))
       Dim entries = New System.IO.DirectoryInfo(Directory).GetFileSystemInfos()
       For Each entry In entries
          If TypeOf entry Is System.IO.FileInfo AndAlso _
             (entry.Name.EndsWith(".exe") OrElse _
              entry.Name.EndsWith(".dll") OrElse _
              entry.Name.EndsWith(".ocx")) Then
             For Each clsid In GetCLSIDs(entry.FullName)
                result.Add(New KeyValuePair(Of Guid, String)(clsid, entry.FullName))
             Next
          ElseIf TypeOf entry Is System.IO.DirectoryInfo Then
             result.AddRange(GetDirCLSIDs(entry.FullName))
          End If
       Next
       Return result.ToArray()
    End Function
    
    Private Function GetCLSIDs(FileName As String) As Guid()
       Dim tlb = TypeLibTypes.Interop.TypeLib.Load(FileName)
       If tlb.GetTypeLib() Is Nothing Then Return {}
       Dim result As New List(Of Guid)
       Try
          For typeIndex As Integer = 0 To tlb.GetTypeInfoCount() - 1
             Using attr = tlb.GetTypeInfo(typeIndex).GetTypeAttr()
                If (attr.wTypeFlags And TypeLibTypes.Interop.TYPEFLAGS.TYPEFLAG_FHIDDEN) = 0 _
                   AndAlso attr.typekind = TypeLibTypes.Interop.TYPEKIND.TKIND_COCLASS Then
                   result.Add(attr.Guid)
                End If
             End Using
          Next
       Catch ex As System.Runtime.InteropServices.COMException When ex.ErrorCode = &H80029C4A
          Return {}
       End Try
       Return result.ToArray()
    End Function
    
    Private Function ScanForCLSIDs(FileName As String, CLSIDFiles As KeyValuePair(Of Guid, String)()) As String()
       Dim content As Byte()
       Using file As New System.IO.FileStream(FileName, IO.FileMode.Open, IO.FileAccess.Read, IO.FileShare.ReadWrite)
          content = DirectCast(Array.CreateInstance(GetType(Byte), CInt(file.Length)), Byte())
          file.Read(content, 0, CInt(file.Length))
       End Using
       Dim result As New List(Of String)
       For Each CLSIDFile In CLSIDFiles
          If FindBytes(content, CLSIDFile.Key.ToByteArray()) >= 0 Then
             If Not result.Contains(CLSIDFile.Value) Then result.Add(CLSIDFile.Value)
          End If
       Next
       Return result.ToArray()
    End Function
    
    Private Function FindBytes(data As Byte(), pattern As Byte(), Optional start As Integer = 0) As Int32
       If pattern.Length = 0 Then Return -1
       If data.Length = 0 Then Return 0
       Dim search = ProcessSearchPattern(pattern)
       Dim i As Integer = start + pattern.Length - 1
       Do While i < data.Length
          Dim j As Integer = pattern.Length - 1
          Do While j >= 0 AndAlso data(i) = pattern(j)
             i -= 1
             j -= 1
          Loop
          If j < 0 Then Return i + 1
          i += Math.Max(search.Delta1(data(i)), search.Delta2(j))
       Loop
       Return -1
    End Function
    
    Private Class SearchPattern
       Public Delta1 As Integer()
       Public Delta2 As Integer()
    End Class
    
    Private Class PatternBytes
       Private bytes As Byte()
       Public Sub New(value As Byte())
          bytes = value
       End Sub
       Public Overrides Function GetHashCode() As Integer
          Dim i As Integer = 0
          GetHashCode = 0
          Do While i <= bytes.Length - 4
             GetHashCode = GetHashCode Xor BitConverter.ToInt32(bytes, i)
             i += 4
          Loop
          If i <= bytes.Length - 2 Then
             GetHashCode = GetHashCode Xor BitConverter.ToInt16(bytes, i)
             i += 2
          End If
          If i < bytes.Length Then
             GetHashCode = GetHashCode Xor bytes(i)
          End If
       End Function
    
       Public Overrides Function Equals(obj As Object) As Boolean
          Dim compare As Byte()
          If TypeOf obj Is PatternBytes Then
             compare = DirectCast(obj, PatternBytes).bytes
          ElseIf TypeOf obj Is Byte() Then
             compare = DirectCast(obj, Byte())
          Else
             Return False
          End If
          If compare.Length <> bytes.Length Then Return False
          For i As Integer = 0 To bytes.Length - 1
             If bytes(i) <> compare(i) Then Return False
          Next
          Return True
       End Function
    End Class
    
    Private Function ProcessSearchPattern(pattern As Byte()) As SearchPattern
       Static cache As New Dictionary(Of PatternBytes, SearchPattern)
       ProcessSearchPattern = Nothing
       Dim pb As New PatternBytes(pattern)
       If cache.TryGetValue(pb, ProcessSearchPattern) Then Exit Function
       ProcessSearchPattern = New SearchPattern()
       Dim patLen = pattern.Length
       ProcessSearchPattern.Delta1 = DirectCast(Array.CreateInstance(GetType(Integer), Byte.MaxValue + 1), Integer())
       ProcessSearchPattern.Delta2 = DirectCast(Array.CreateInstance(GetType(Integer), patLen), Integer())
       Dim c As Integer ' Cannot run a For loop to Byte.MaxValue in a Byte!
       Dim i As Integer
       For c = 0 To Byte.MaxValue
          ProcessSearchPattern.Delta1(c) = patLen
       Next
       For i = 0 To patLen - 2
          ProcessSearchPattern.Delta1(pattern(i)) = patLen - 1 - i
       Next
    
       Dim m As Integer
       Dim lastPrefixIndex = patLen - 1
       Dim isPrefix As Boolean = False
       For i = patLen - 1 To 0 Step -1
          isPrefix = True
          If patLen - i - 2 > 0 Then
             For m = 0 To patLen - i - 2
                If pattern(m) <> pattern(m + i + 1) Then
                   isPrefix = False
                   Exit For
                End If
             Next
          End If
          If isPrefix Then lastPrefixIndex = i + 1
          ProcessSearchPattern.Delta2(i) = lastPrefixIndex + (patLen - 1 - i)
       Next
    
       For i = 0 To patLen - 2
          For m = 0 To i - 1
             If pattern(i - m) = pattern(patLen - 1 - m) Then
                Exit For
             End If
             If pattern(i - m) <> pattern(patLen - 1 - m) Then
                ProcessSearchPattern.Delta2(patLen - 1 - m) = patLen - 1 - i + m
             End If
          Next
       Next
    
       cache.Add(pb, ProcessSearchPattern)
    End Function