Search code examples
vbaexcelmshtml

Late bind an IHTML element


I'm trying to create a late bound VBA project to search the web. At one point I have the following code (early bound):

Dim currPage as HTMLDocument: Set currPage = objIE.document 'where objIE is set with Set objIE = CreateObject("InternetExplorer.application")
'(late bound as it is dim'd as Object)
    Dim myDiv As HTMLDivElement: Set myDiv = currPage.getElementById("fbar")
    Dim elemRect As IHTMLRect: Set elemRect = myDiv.getBoundingClientRect
    'Scroll until bottom of page is in view
    Do Until elemRect.bottom > 0
        currPage.parentWindow.scrollBy 0, 10000
        Set elemRect = myDiv.getBoundingClientRect
    Loop

This code becomes this when late bound: (or so I thought)

Dim currPage as Object: Set currPage = objIE.document
    Dim myDiv As Object: Set myDiv = currPage.getElementById("fbar")
    Dim elemRect As Object: Set elemRect = myDiv.getBoundingClientRect
    'Scroll until bottom of page is in view
    Do Until elemRect.bottom > 0
        currPage.parentWindow.scrollBy 0, 10000
        Set elemRect = myDiv.getBoundingClientRect
    Loop

The problem, I'm guessing, lies with the I infront of IHTMLRect, which MSDN tells me denotes an element on a web page which doesn't have an actual object associated with it - consequently assigning it to an unspecified Object just makes no sense in the code. (That's a complete guess)

Anyway, the early bound code works fine, the late bound code quits execution at elemRect.bottom

Why is this and how can I fix it?


Solution

  • Objects in VBA can implement multiple interfaces and the methods/properties you can call depend on the interface that you are using to access the object. A simple example:

    ' This means access the object via the IUnknown interface
    ' IUnknown is the interface from which all other COM
    ' interfaces inherit
    Dim x As IUnknown
    Set x = ThisWorkbook.Worksheets(1)
    
    ' Commented out as this won't compile because the
    ' Name property isn't defined in IUnknown
    ' MsgBox x.Name
    
    ' This means access the object through the default
    ' interface associated with the Worksheet object type
    Dim w As Worksheet
    Set w = x
    
    ' Now we can get to the name (same object, different interface)
    MsgBox w.Name
    

    In the case of MSHTML, I would guess that methods like getElementById are returning an interface like one of the versions of IHTMLElement. This means that methods/properties defined in an interface like IHTMLDivElement can't be accessed.

    IUnknown has a method called QueryInterface which is used to get to the different interfaces which the object implements. This cannot be called directly in VBA, however, as the VBA way of doing this is by using Dim with the appropriate interface and then using Set. This will only compile if the necessary references have been set which in turn defeats the purpose of late binding.

    There is a workaround using CallByName. To go back to the worksheet example, this works:

    Dim x As IUnknown
    Set x = ThisWorkbook.Worksheets(1)
    
    ' Commented out as this won't compile because the
    ' Name property isn't defined in IUnknown
    ' MsgBox x.Name
    
    ' Can get to the property via CallByName
    MsgBox CallByName(x, "Name", VbGet)
    

    For the MSHTML issue, this works (note that the call type is changed to VbMethod):

    Dim elemRect As Object: Set elemRect = CallByName(myDiv, "getBoundingClientRect", 
        VbMethod)
    stTimer = Timer
    'Scroll until bottom of page is in view
    Do Until elemRect.bottom > 0 Or tElapsed > timeout 'timeout after n seconds
        currPage.parentWindow.scrollBy 0, 10000
        Set elemRect = CallByName(myDiv, "getBoundingClientRect", VbMethod)
        tElapsed = Timer - stTimer
    Loop
    

    I know very little about COM objects so there may well be other issues which I have failed to consider


    Full code (adapted from your answer to another question). Running the function repeatedly in quick succession produces errors due to IE taking time to shutdown (see this question for a similar issue). Re-use the same IE object if you need to run multiple queries in succession:

    Option Explicit
    
    Public Function GOOGLE_COUNT(searchTerm As String, xRes As Long, yRes As Long, Optional timeout As Long = 10) As Long
    
        Dim url As String
        Dim objIE As Object
        Dim currPage As Object
        Dim stTimer As Double, tElapsed As Single
        Dim valueResult As Object
    
        'create URL to page with these image criteria
        url = "https://www.google.com/search?q=" & searchTerm & _
                            "&tbm=isch&source=lnt&tbs=isz:ex,iszw:" & xRes & ",iszh:" & yRes
    
        'initiating a new instance of Internet Explorer and asigning it to objIE
        Set objIE = CreateObject("InternetExplorer.Application")
    
        'Google images search
        objIE.navigate url
        Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
        Set currPage = objIE.document
        Dim myDiv As Object: Set myDiv = currPage.getElementById("fbar")
        Dim elemRect As Object: Set elemRect = CallByName(myDiv, "getBoundingClientRect", VbMethod)
        stTimer = Timer
        'Scroll until bottom of page is in view
        Do Until elemRect.bottom > 0 Or tElapsed > timeout 'timeout after n seconds
            currPage.parentWindow.scrollBy 0, 10000
            Set elemRect = CallByName(myDiv, "getBoundingClientRect", VbMethod)
            tElapsed = Timer - stTimer
        Loop
        myDiv.ScrollIntoView
        'Count the images
        Set valueResult = currPage.getElementById("rg_s").getElementsByTagName("IMG")
        GOOGLE_COUNT = valueResult.Length
        objIE.Quit
    
    End Function
    
    Sub foo()
    
    MsgBox GOOGLE_COUNT("St. Mary", 1366, 768)
    
    End Sub