Search code examples
htmlexcelvbaselenium-webdriverinternet-explorer

Extract data from an HTML table on a Web page into Excel


I need some help with taking data from website using vba. I have table in excel with etf ticker, link and price and using vba I try to take close price of yesterday from each link but the problem is that I am not sure what to write at this place "HTMLDoc.getElementById("???")". I can't find any ID from that website html code, hope you can help me with that.

website example: https://www.boerse-frankfurt.de/en/etf/amundi-prime-global-ucits-etf-dr-c/price-history/historical-prices-and-volumes

my code:


Sub GetETFPrices()
    Dim IE As New InternetExplorer
    Dim HTMLDoc As HTMLDocument
    Dim ETFLink As String
    Dim ETFPrice As String
    Dim i As Long

    ' Loop through each row in the table
    For i = 2 To ThisWorkbook.Worksheets("Sheet1").Cells(Rows.Count, 1).End(xlUp).row
        ETFLink = ThisWorkbook.Worksheets("Sheet1").Cells(i, 2).Value ' Get link from column 2
        
        ' Opening the link in Internet Explorer
        IE.Navigate ETFLink

        
        Do While IE.Busy
            DoEvents
        Loop

        ' Get the ETF price element from the HTML document
        Set HTMLDoc = IE.document
        Set ETFRow = HTMLDoc.getElementById("???").Rows(HTMLDoc.getElementById("???").Rows.Length - 1)
        ETFPrice = ETFRow.Cells(3).innerText

        ' Updating the price column in Excel sheet
        ThisWorkbook.Worksheets("Sheet1").Cells(i, 3).Value = ETFPrice
    Next i

    IE.Quit
End Sub

Solution

  • If you don't have to use Internet Explorer, I would suggest to use Selenium which is compatible with Chrome (and FireFox).

    Now, looking at the page you linked, this is the case where your data is inside a table and there is no id attribute that will allow you to target a specific cell. Because of that, I would suggest to import the whole table from the page in Excel and then you can more easily navigate to find the data point you are interested in.

    You could use this approach for instance:

    Sub ImportHtmlTable()
    
        'Initialize Selenium
        Dim bot As WebDriver
        Set bot = New WebDriver
        
        bot.Start "chrome", "YourUrl"
        bot.Get "/"
        DoEvents
    
        Dim Tables As WebElements
        Set Tables = bot.FindElementsByTag("table")
    
        Dim wb As Workbook
        Set wb = Workbooks("YourWorkbookName")
        HtmlTablesToRange Tables, wb.Sheets(1).Range("A1")
        
        bot.Close
        Set bot = Nothing
    
    End Sub
    
    'Inspired by code from: https://www.vba-market.com/
    Sub HtmlTablesToRange(Tables As WebElements, Destination As Range)
    
        Destination.CurrentRegion.ClearContents
    
        Dim tb As WebElement
        Dim ths As WebElements 'Headers    (th)
        Dim trs As WebElements 'Rows       (tr)
        Dim tds As WebElements 'Data cells (td)
        
        For Each tb In Tables
        
            Dim theads As WebElements
            Set theads = tb.FindElementsByTag("thead")
            Dim thead As WebElement
            For Each thead In theads
                Set trs = thead.FindElementsByTag("tr")
                Dim tr As WebElement
                For Each tr In trs
                    Set ths = tr.FindElementsByTag("th")
                    Dim y As Long, z As Long
                    y = 0 ' Resets back to first column
                    Dim th As WebElement
                    For Each th In ths
                        Destination.Offset(z, y).Value = th.text
                        y = y + 1
                    Next th
                    z = z + 1
                Next tr
            Next thead
     
            Dim tbodys As WebElements
            Set tbodys = tb.FindElementsByTag("tbody")
            Dim tbody As WebElement
            For Each tbody In tbodys
                Set trs = tbody.FindElementsByTag("tr")
                For Each tr In trs
                    Set tds = tr.FindElementsByTag("td")
                    y = 0 ' Resets back to first column
                    Dim td As WebElement
                    For Each td In tds
                        Destination.Offset(z, y).Value = td.text
                        y = y + 1
                    Next td
                    z = z + 1
                Next tr
            Next tbody
            z = z + 1
        Next tb
    
    End Sub
    
    

    If you need to improve performance, you can also turn off Application.ScreenUpdating during the execution of the code.

    Note that you might need to update the Chrome driver usually located in C:\Users\YourUserName\AppData\Local\SeleniumBasic

    Disclaimer: Always make sure that you are allowed to gather data from the website you are interested in.