In Excel VBA to load a website and get it into a sheet I have been using the following:
Dim IE As Object
Set IE = CreateObject("InternetExplorer.Application")
IE .navigate "https://www.wsj.com/market-data/bonds/treasuries"
And then I can copy and paste it into my Excel sheet. But this website no longer works with IE11, and Excel VBA insists on using IE11 even though it is about to be deprecated.
Is there another way? I have also looked at:
Selenium: but it seems to be pretty much obsolete for VBA (not updated since 2016) and I couldn’t get it to work with Edge or Firefox in VBA anyway.
AutoIt: I got it to write the website’s HTML code to a TXT file (oHTTP = ObjCreate("winhttp.winhttprequest.5.1") ; $oHTTP.Open("GET", $URL1, False) ; $oHTTP.Send(); $oReceived = $oHTTP.ResponseText; FileWrite($file, $oReceived)) but the txt file contents are far from convenient as there is endless HTML stuff in it. It’ll take a fair amount of VBA code to sort through the mess, which probably means it won’t be reliable going forward. Also given the size of my workbook which is very slow, it will take literally several minutes to copy the website data into a sheet element by element.
Surely there must be an easy way to load the site, or just the table within the site, into an Excel sheet? This must be a well trodden path, but after much googling I can’t find an easy solution that actually works.
I have a 5-10 web pages being loaded into this workbook, and it seems to be a full time job keeping the whole thing working!! Any thoughts/help very much appreciated!!!
Similar idea to Christopher's answer in using regex. I am grabbing the instruments data (JS array), splitting the component dictionaries out (minus the end }
), and then use regex, based on headers, to grab the appropriate values.
I use a dictionary to handle input/output headers, and set a couple of request headers to help to signal browser based request and to mitigate for being served cached results.
Ideally, one would use an html parser and grab the script
tag, then use a json parser on the JavaScript object within the script
tag.
If you want the data from the other tabbed results, I can add that in by explicitly setting re.Global = True
, then looping the returned matches. Depends whether you want those and how you want them to appear in the sheet(s).
I currently write results out to a sheet called Treasury Notes & Bonds
.
Option Explicit
Public Sub GetTradeData()
Dim s As String, http As MSXML2.XMLHTTP60 'required reference Microsoft XML v6,
Set http = New MSXML2.XMLHTTP60
With http
.Open "GET", "https://www.wsj.com/market-data/bonds/treasuries", False
.setRequestHeader "User-Agent", "Mozilla/5.0"
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.send
s = .responseText
End With
Dim re As VBScript_RegExp_55.RegExp 'required reference Microsoft VBScript Regular Expressions
Set re = New VBScript_RegExp_55.RegExp
re.Pattern = "instruments"":\[(.*?)\]"
s = re.Execute(s)(0).SubMatches(0)
Dim headers() As Variant, r As Long, c As Long, mappingDict As Scripting.Dictionary 'required reference Microsoft Scripting Runtime
Set mappingDict = New Scripting.Dictionary
mappingDict.Add "maturityDate", "MATURITY"
mappingDict.Add "coupon", "COUPON"
mappingDict.Add "bid", "BID"
mappingDict.Add "ask", "ASKED"
mappingDict.Add "change", "CHG"
mappingDict.Add "askYield", "ASKED YIELD"
headers = mappingDict.keys
Dim results() As String, output() As Variant, key As Variant
results = Split(s, "}")
ReDim output(1 To UBound(results), 1 To UBound(headers) + 1)
For r = LBound(results) To UBound(results) - 1
c = 1
For Each key In mappingDict.keys
re.Pattern = "" & key & """:""(.*?)"""
output(r + 1, c) = re.Execute(results(r))(0).SubMatches(0)
c = c + 1
Next
Next
re.Pattern = "timestamp"":""(.*?)"""
re.Global = True
With ThisWorkbook.Worksheets("Treasury Notes & Bonds")
.UsedRange.ClearContents
Dim matches As VBScript_RegExp_55.MatchCollection
Set matches = re.Execute(http.responseText)
.Cells(1, 1) = matches(matches.Count - 1).SubMatches(0)
.Cells(2, 1).Resize(1, UBound(headers) + 1) = headers
.Cells(3, 1).Resize(UBound(output, 1), UBound(output, 2)) = output
End With
End Sub