Tags: vba, web-scraping, proxy, timeout, serverxmlhttp

Unable to set customized timeout within ServerXMLHTTP request


I've written a VBA script that scrapes the first post from a website through a proxied request. For each HTTP request, the script takes a proxy from a list and checks how many posts came back. Once a request succeeds, the script should write out the first post together with the proxy that was used, then exit the loop.

Sometimes the script works correctly, but most of the time it takes ages to finish, even though I set the timeouts before sending the request. At this point I doubt that I've filled in the timeout parameters correctly. What I expect is that the script waits up to the specified time for a response; otherwise it should raise a timeout error and move on to the next proxy.

This is what I've written so far:

Sub HandleTimeOut()
    Dim Http As New ServerXMLHTTP60, Html As New HTMLDocument
    Dim elem As Object, proxyList As Variant, oProxy As Variant

    proxyList = [{"50.246.120.125:8080","198.204.253.115:3128","98.172.142.99:8080","207.188.231.141:8080"}]

    For Each oProxy In proxyList
        With Http
            .Open "GET", "https://stackoverflow.com/questions/tagged/web-scraping", True
            .setRequestHeader "User-Agent", "Mozilla/5.0"
            .setProxy 2, oProxy
            .setTimeouts 600000, 600000, 15000, 15000
            On Error Resume Next
            .send
            While .readyState < 4: DoEvents: Wend
            Html.body.innerHTML = .responseText
            Set elem = Html.querySelectorAll(".summary .question-hyperlink")
            On Error GoTo 0
        End With

        If elem.Length > 0 Then
            [A1] = oProxy
            [B1] = elem(0).innerText
            Exit For
        End If
    Next oProxy
End Sub

What is the right way to set a timeout of five seconds?


Solution

  • Open the request synchronously by passing False as the third (async) argument to .Open:

    .Open "GET", "https://stackoverflow.com/questions/tagged/web-scraping", True

    should be

    .Open "GET", "https://stackoverflow.com/questions/tagged/web-scraping", False

    See also: how to set http timeout using asp?
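
For reference, here is a minimal sketch of the question's loop with a synchronous request and a five-second limit on each phase. The four arguments to setTimeouts are the resolve, connect, send, and receive timeouts, in milliseconds. The sub name HandleTimeOutSync is made up for this example, and the Err.Number check assumes that a timed-out synchronous .send raises a run-time error. The call order and library references are the same as in the question's code, and the proxies are the question's own and may no longer be live.

```vba
' Sketch only. Requires references to "Microsoft XML, v6.0" and
' "Microsoft HTML Object Library", as in the question's code.
Sub HandleTimeOutSync()   ' hypothetical name, not from the original post
    Dim Http As New ServerXMLHTTP60, Html As New HTMLDocument
    Dim elem As Object, proxyList As Variant, oProxy As Variant

    proxyList = [{"50.246.120.125:8080","198.204.253.115:3128","98.172.142.99:8080","207.188.231.141:8080"}]

    For Each oProxy In proxyList
        Set elem = Nothing                  ' forget results from the previous proxy

        On Error Resume Next                ' assumption: a timeout surfaces as an error on .send
        With Http
            ' False = synchronous: .send blocks until a response or an error
            .Open "GET", "https://stackoverflow.com/questions/tagged/web-scraping", False
            .setRequestHeader "User-Agent", "Mozilla/5.0"
            .setProxy 2, oProxy
            ' resolve, connect, send, receive (all in milliseconds)
            .setTimeouts 5000, 5000, 5000, 5000
            .send
            If Err.Number = 0 Then          ' parse only when the request went through
                Html.body.innerHTML = .responseText
                Set elem = Html.querySelectorAll(".summary .question-hyperlink")
            End If
        End With
        Err.Clear
        On Error GoTo 0

        If Not elem Is Nothing Then
            If elem.Length > 0 Then
                [A1] = oProxy
                [B1] = elem(0).innerText
                Exit For
            End If
        End If
    Next oProxy
End Sub
```

Because the call is synchronous, the readyState polling loop is no longer needed. Each of the four timeouts applies to its own phase, so a single slow proxy can still take longer than five seconds in total before the loop moves on.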