I've created a script in VBA to scrape breadcrumbs from a webpage using XML HTTP requests. The script works well when I implement the ActiveX component MSXML2.XMLHTTP.6.0
, but it fails miserably when I switch to MSXML2.serverXMLHTTP.6.0
.
As I've got a plan to use proxies within the script, it is necessary that I stick with MSXML2.serverXMLHTTP.6.0
. However, the seond script doesn't work. To let you know, when I print the .responseText
, I see gibberish content within it, as in the following:
??GN?!v:h??_??Og<]?????X ?6??'o??F??6 ?uh????x?r???????sP??????????[B??k????]??????yC????'???L???????,*?Z????? ?vX ?c?q\t?j??????K?|???P 7??k?y?<;?>????a?*P1????w???[?T?/f?? ?7?gn??V<E?Z??6t:??1??????E'v?1?? ?w??+??????-aD????wy?
.
Using MSXML2.XMLHTTP.6.0
(works flawlessly):
Option Explicit
Sub GrabInfo()
Const Url$ = "https://www.amazon.com/gp/product/B00FQT4LX2?th=1"
Dim oHttp As Object, Html As HTMLDocument, breadCrumbs$
Set Html = New HTMLDocument
Set oHttp = CreateObject("MSXML2.XMLHTTP.6.0")
With oHttp
.Open "GET", Url, True
.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36"
.send
While .readyState < 4: DoEvents: Wend
MsgBox "Status code: " & .Status
Html.body.innerHTML = .responseText
breadCrumbs = Html.querySelector("#wayfinding-breadcrumbs_feature_div")
MsgBox breadCrumbs
End With
End Sub
Using MSXML2.serverXMLHTTP.6.0
(throwing an error showing Object Variable or With block variable not set
):
Option Explicit
Sub GrabInfo()
Const Url$ = "https://www.amazon.com/gp/product/B00FQT4LX2?th=1"
Dim oHttp As Object, Html As HTMLDocument, breadCrumbs$
Set Html = New HTMLDocument
Set oHttp = CreateObject("MSXML2.serverXMLHTTP.6.0")
With oHttp
.Open "GET", Url, True
.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36"
.send
While .readyState < 4: DoEvents: Wend
MsgBox "Status code: " & .Status
Html.body.innerHTML = .responseText
breadCrumbs = Html.querySelector("#wayfinding-breadcrumbs_feature_div")
MsgBox breadCrumbs
End With
End Sub
How can I make the second script built on
MSXML2.serverXMLHTTP.6.0
work?
I tried your code and got the same "glibberish" response. However, removing the statement setRequestHeader
(where you set the user agent) solved that issue and the response was readable, while using a different User agent (the one suggested here) resulted in the strange response.
Note that the result of Html.querySelector
is an object, not a string and you should use
Dim breadCrumbs as Object
Set breadCrumbs = Html.querySelector("#wayfinding-breadcrumbs_feature_div")
MsgBox breadCrumbs.innerHTML