Hi there, hoping that someone can help me. On this example link: https://www.academiadasapostas.com/stats/competition/brasil/26
I want to get all the href links which are the target of the "VS". I'm trying examples like this one:
Sub ScrapeScores()
Dim IE As New SHDocVw.InternetExplorer
Dim HTMLDoc As MSHTML.HTMLDocument
Dim HTMLTables As MSHTML.IHTMLElementCollection
Dim HTMLTable As MSHTML.IHTMLElement
Dim HTMLDiv As MSHTML.IHTMLElement
Dim TableSection As MSHTML.IHTMLElement
Dim TableRow As MSHTML.IHTMLElement
Dim TableCell As MSHTML.IHTMLElement
Dim RowText As String
IE.Visible = True
IE.navigate "https://www.academiadasapostas.com/stats/competition/brasil/26"
Do While IE.readyState <> READYSTATE_COMPLETE Or IE.Busy
Loop
Set HTMLDoc = IE.document
Set HTMLDiv = HTMLDoc.getElementById("competition-round-group-0")
Set HTMLTables = HTMLDiv.getElementsByTagName("a")
For Each HTMLTable In HTMLTables
Debug.Print HTMLTable.ID, "&", HTMLTable.className
For Each TableSection In HTMLTable.Children
Debug.Print , TableSection.tagName
Next TableSection
Next HTMLTable
End Sub
But with no success. I think I could use CSS with a SelectorAll, right? Since IE is going to get extinted, would be nice to use the CSS instead.
Thank you in advance for any answer.
You can use the following css pattern with querySelectorAll .competition-rounds td:nth-child(4) > a
. Loop the returned nodeList
and extract the href
from each node. This selects for the 4th column within that table, then the child a
tag, from which the href
attribute is extracted during the loop.
Required references:
Option Explicit
Public Sub PrintLinks()
Dim ie As SHDocVw.InternetExplorer, nodeList As MSHTML.IHTMLDOMChildrenCollection
Set ie = New SHDocVw.InternetExplorer
With ie
.Visible = True
.Navigate2 "https://www.academiadasapostas.com/stats/competition/brasil/26"
While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Wend
Set nodeList = ie.Document.querySelectorAll(".competition-rounds td:nth-child(4) > a")
Dim i As Long
For i = 0 To nodeList.length - 1
Debug.Print nodeList.Item(i).href
Next
Stop
.Quit
End With
End Sub
Reading: