I'm trying to figure out how I can use .querySelector() on .querySelectorAll().
For example, I get the expected results when I try it like this:
Sub GetContent()
    Const URL$ = "https://stackoverflow.com/questions/tagged/web-scraping?tab=Newest"
    Dim HTMLDoc As New HTMLDocument
    Dim HTML As New HTMLDocument, R&, I&

    With New XMLHTTP60
        .Open "Get", URL, False
        .send
        HTMLDoc.body.innerHTML = .responseText
    End With

    With HTMLDoc.querySelectorAll(".summary")
        For I = 0 To .Length - 1
            HTML.body.innerHTML = .Item(I).outerHTML
            R = R + 1: Cells(R, 1).Value = HTML.querySelector(".question-hyperlink").innerText
        Next I
    End With
End Sub
The script stops working when I pick another site and try to grab the values under the Rank column of its table, even though I use the same logic:
Sub GetContent()
    Const URL$ = "https://www.worldathletics.org/records/toplists/sprints/100-metres/outdoor/men/senior/2020?page=1"
    Dim HTMLDoc As New HTMLDocument
    Dim HTML As New HTMLDocument, R&, I&

    With New XMLHTTP60
        .Open "Get", URL, False
        .send
        HTMLDoc.body.innerHTML = .responseText
    End With

    With HTMLDoc.querySelectorAll("#toplists tbody tr")
        For I = 0 To .Length - 1
            HTML.body.innerHTML = .Item(I).outerHTML
            R = R + 1: Cells(R, 1).Value = HTML.querySelector("td").innerText
        Next I
    End With
End Sub
The line I'm talking about in both scripts is Cells(R, 1).Value = HTML.querySelector().innerText. In both cases I'm using it inside the .querySelectorAll() container.
If I use .querySelector() on .getElementsByTagName(), it works. I also had success using TagName on TagName, ClassName on ClassName, etc., so I can grab the content in a few different ways.
How can I use .querySelector() on .querySelectorAll() in the second script in order for it to work?
Wrap the row's outerHTML in table tags so the HTML parser knows what to do with it.
HTML.body.innerHTML = "<table>" & .Item(I).outerHTML & "</table>"
Doing so preserves the structure of the opening td tag, which is otherwise stripped of its "<".
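For reference, here is a sketch of the second script from the question with that one-line change applied. It assumes the project references "Microsoft HTML Object Library" and "Microsoft XML, v6.0", and that the page still serves the #toplists table in the raw response; everything else is unchanged from the question.

```vba
Sub GetContent()
    ' Assumed references: Microsoft HTML Object Library, Microsoft XML, v6.0
    Const URL$ = "https://www.worldathletics.org/records/toplists/sprints/100-metres/outdoor/men/senior/2020?page=1"
    Dim HTMLDoc As New HTMLDocument
    Dim HTML As New HTMLDocument, R&, I&

    With New XMLHTTP60
        .Open "Get", URL, False
        .send
        HTMLDoc.body.innerHTML = .responseText
    End With

    With HTMLDoc.querySelectorAll("#toplists tbody tr")
        For I = 0 To .Length - 1
            ' Wrap the row in <table> tags so the temporary document
            ' parses it as a table row and keeps its <td> cells
            HTML.body.innerHTML = "<table>" & .Item(I).outerHTML & "</table>"
            ' The first <td> of each row is the one read here (the Rank column per the question)
            R = R + 1: Cells(R, 1).Value = HTML.querySelector("td").innerText
        Next I
    End With
End Sub
```

The first script did not need the wrapper because a .summary block is not a table fragment, so it parses fine on its own inside body.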