Tags: vba, excel, parsing, html-parsing, getelementsbytagname

How to get the meta name keywords in VBA


I am trying to get the keywords meta tag from a webpage:

<meta name="keywords" content="Mitch Albom,For One More Day,Little, Brown Book Group,0751537535,Fiction / General,General & Literary Fiction,Modern & contemporary fiction (post c 1945),USA">

I need to read the value of its content attribute and need help doing so. This is my current code:

Option Explicit

Sub GetData()
    Dim ie As New InternetExplorer
    Dim str As String
    Dim wk As Worksheet
    Dim webpage As New HTMLDocument
    Dim item As HTMLHtmlElement

    Set wk = Sheet1
    str = wk.Range("Link").value
    ie.Visible = True

    ie.Navigate str

    Do
        DoEvents
    Loop Until ie.ReadyState = READYSTATE_COMPLETE

    Dim Doc As HTMLDocument
    Set Doc = ie.Document

    Dim kwd As String
    kwd = Trim(Doc.getElementsByTagName("keywords").innerText)
    MsgBox kwd

End Sub

Solution

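  • Doc.getElementsByTagName("keywords") looks for elements whose tag is keywords, but the tag here is meta; keywords is only the value of its name attribute. The call also returns a collection, which has no innerText property. The best way to do this is to find the meta element whose name is keywords and read its content property. You can do it like this: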

    Option Explicit
    
    ' Requires references to "Microsoft Internet Controls" and
    ' "Microsoft HTML Object Library" (Tools > References).
    Sub GetData()
        Dim ie As New InternetExplorer
        Dim str As String
        Dim wk As Worksheet
    
        Set wk = Sheet1
        str = wk.Range("Link").value
        ie.Visible = True
    
        ie.Navigate str
    
        Do
            DoEvents
        Loop Until ie.ReadyState = READYSTATE_COMPLETE
    
    
        'Find the proper meta element --------------
        Const META_TAG As String = "META"
        Const META_NAME As String = "keywords"
        Dim Doc As HTMLDocument
        Dim metaElements As Object
        Dim element As Object
        Dim kwd As String
    
    
        Set Doc = ie.Document
        Set metaElements = Doc.all.tags(META_TAG)
    
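        For Each element In metaElements
            ' Compare case-insensitively; a page may write name="Keywords".
            If LCase(element.Name) = META_NAME Then
                kwd = element.Content
                Exit For 'Stop at the first match
            End If
        Next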
    
        MsgBox kwd
    
    End Sub
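
If you need more than one meta value, the loop above could be pulled out into a small reusable function. The following is only a sketch: GetMetaContent and its parameters are hypothetical names that do not appear in the original answer. It uses getElementsByTagName with the tag name "meta", as the question tried to do, rather than Doc.all.tags. The document is taken late-bound as Object, on the assumption that it comes from ie.Document or a similar MSHTML document.

```vba
' Hypothetical helper (not from the original answer): returns the content
' attribute of the first <meta> element whose name matches metaName,
' or an empty string when no such element exists.
Function GetMetaContent(ByVal doc As Object, ByVal metaName As String) As String
    Dim element As Object

    ' getElementsByTagName takes the tag name ("meta"), not the value of
    ' the name attribute, and returns a collection to loop over.
    For Each element In doc.getElementsByTagName("meta")
        ' vbTextCompare makes the match case-insensitive.
        If StrComp(element.Name, metaName, vbTextCompare) = 0 Then
            GetMetaContent = element.Content
            Exit Function
        End If
    Next element
End Function
```

Inside GetData, the search loop could then be replaced with kwd = GetMetaContent(ie.Document, "keywords"), and the same function could be called again for other names such as "description".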