Search code examples
excelvbaweb-scrapingxmlhttprequest

VBA - web scraping can not find correct GET request


My question is related to other question VBA - web scraping can not get HTMLElement innerText. I have a similar problem

Website URL - https://www.hnb.hr/en/core-functions/monetary-policy/exchange-rate-list/exchange-rate-list

I need to get the date of currency reference and the selected values. The problem is that I can not find a correct GET request where these values are finally generated. I've found that it is related to the POST request:

POST /en/core-functions/monetary-policy/exchange-rate-list/exchange-rate-list?p_p_id=tecajnalistacontroller_WAR_hnbtecajnalistaportlet&p_p_lifecycle=2&p_p_state=normal&p_p_mode=view&p_p_resource_id=getTecajnaAjaxDataURL&p_p_cacheability=cacheLevelPage&p_p_col_id=column-2&p_p_col_count=2 HTTP/1.1

I would like to use a technique with getting by id, class or tag - whatever but again, provided GET URL request is too quick to retrieve the required info


Solution

  • XMLHTTP request and API:

    I would use their API as shown below. I have some helper functions to aid with parsing the response. In GetDict function you can set the currencies you are interested in. In function GetRate you can specify the rate you are interested in. If you don't specify, it defaults to "median_rate".

    Calling the API:

    To get the rates for a particular date, make a[n] HTTP call to the following URL:

    http://hnbex.eu/api/v1/rates/daily/?date=YYYY-MM-DD

    The date parameter is optional. If not set, the current date (today) is used.

    You can parse the JSON response with a JSON parser but I found it simpler to go with using Split to grab the required info from the JSON string. If you are familiar with JSON I will happily update with a JSON parsing example.

    Option Explicit
    
    Public Sub GetInfo()
        'http://hnbex.eu/api/v1/
        Dim strJSON As String, http As Object, json As Object
        Const URL As String = "http://hnbex.eu/api/v1/rates/daily/"
    
        Set http = CreateObject("MSXML2.XMLHTTP")
        With http
            .Open "GET", URL, False
            .send
            strJSON = .responseText
        End With
        'Set json = JsonConverter.ParseJson(strJSON) '<== You could parse the JSON using a JSON parse such as [JSONConverter][1]
    
        Dim currencyDict As Object
        Set currencyDict = GetDict
    
        Dim key As Variant, dictKeys As Variant, result As Variant
        For Each key In currencyDict.keys
            result = GetRate(strJSON, key)
            If Not IsError(result) Then currencyDict(key) = result
            result = vbNullString
        Next key
    
        PrintDictionary currencyDict
    
    End Sub
    
    Public Function GetDict() As Object '<== You could adapt to pass currencies as string arguments to the function. Or even a string array.
        Dim dict As Object
        Set dict = CreateObject("Scripting.Dictionary")
        dict.Add "EUR", vbNullString
        dict.Add "CZK", vbNullString
        dict.Add "HRK", vbNullString
        dict.Add "HUF", vbNullString
        dict.Add "PLN", vbNullString
        dict.Add "RON", vbNullString
        dict.Add "RSD", vbNullString
        Set GetDict = dict
    End Function
    
    Public Function GetRate(ByVal json As String, ByVal key As Variant, Optional ByVal rate As String = "median_rate") As Variant
        Dim arr() As String, tempString As String
        On Error GoTo Errhand
        arr = Split(json, """currency_code"": " & Chr$(34) & key & Chr$(34))
        tempString = arr(1)
        tempString = Split(arr(1), Chr$(34) & rate & Chr$(34) & ":")(1)
        tempString = Split(tempString, ",")(0)
        GetRate = tempString
        Exit Function
    Errhand:
        GetRate = CVErr(xlErrNA)
    End Function
    
    Public Sub PrintDictionary(ByVal dict As Object)
        Dim key As Variant
        For Each key In dict.keys
            Debug.Print key & " : " & dict(key)
        Next
    End Sub
    

    Internet Explorer:

    You can use an loop with explicit wait for element to be present on page (or populated)

    Option Explicit
    Public Sub GetInfo()
        Dim IE As New InternetExplorer, t As Date, hTable As HTMLTable, clipboard As Object
        Const WAIT_TIME_SECS As Long = 5
        t = Timer
    
        With IE
            .Visible = True
            .navigate "https://www.hnb.hr/en/core-functions/monetary-policy/exchange-rate-list/exchange-rate-list"
    
            While .Busy Or .readyState < 4: DoEvents: Wend
    
            Do
                DoEvents
                On Error Resume Next
                Set hTable = .document.getElementById("records_table")
                On Error GoTo 0
                If Timer - t > WAIT_TIME_SECS Then Exit Do
            Loop While hTable Is Nothing
    
            If hTable Is Nothing Then
                .Quit
                Exit Sub
            End If
            Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
            clipboard.SetText hTable.outerHTML
            clipboard.PutInClipboard
            ThisWorkbook.Worksheets("Sheet1").Cells(1, 1).PasteSpecial
            .Quit                                    '<== Remember to quit application
        End With
    End Sub