Search code examples
excelvbamsxml2

Multiple css selectors vba


I have a case where I am trying to scrape multiple pages, but I noticed that the desired part to scrape is different sometimes and this makes me to use IF statements to check for the existence of object like that

Set obj = html.querySelector("div > blockquote > p > span > strong")
If obj Is Nothing Then
    Set obj = html.querySelector("div > blockquote > p > strong > span")
    If obj Is Nothing Then
        Set obj = html.querySelector("div > blockquote:nth-child(14) > p > strong")
        If obj Is Nothing Then
            Set obj = html.querySelector("div > blockquote:nth-child(13) > p > strong")
            If obj Is Nothing Then
                Set obj = html.querySelector("div > blockquote:nth-child(12) > p > strong")
            End If
        End If
    End If
End If

Is there an alternative and more reliable way to solve such a problem? You have to know there are more cases for the element


Solution

  • Without actual html to work with unsure if there are additional alternatives such as writing simpler/more transferable css selector lists.

    That said, here are two options I would consider. Option 1: For very long css selector lists. Reduce the complexity of your code and have the one level of nesting. Option 2: For shorter css selector lists, use OR syntax to test for alternate patterns.

    Having each alternate list on its own line, in one place, should aid with code maintenance over time.


    Dim tests() As Variant, test As Long
    
    tests = Array( _
    "div > blockquote > p > span > strong", _
    "div > blockquote > p > strong > span", _
    "div > blockquote:nth-child(14) > p > strong", _
    "div > blockquote:nth-child(13) > p > strong", _
    "div > blockquote:nth-child(12) > p > strong")
    
    '-------------------------   
    'Option 1: Single nested testing for longer selector list
    
    For test = LBound(tests) To UBound(tests)
        Set obj = HTML.querySelector(tests(test))
        If Not obj Is Nothing Then Exit For
    Next
    
    'Option 2: CSS OR syntax for shorter selector list
    
    Dim selectorList As String
    
    selectorList = Join$(tests, ",")
    Set obj = HTML.querySelector(selectorList)
    
    '--------------- then continue -------    
    If Not obj Is Nothing Then
     ' do something
    End If
    

    If going with Option 1 I might then go on to consider using a flag boolean variable

    Dim found As Boolean
    For test = LBound(tests) To UBound(tests)
        Set obj = html.querySelector(tests(test))
        found = Not obj Is Nothing
        If found Then Exit For
    Next
    
    If found Then
        'do something
    End If