I am trying to extract the text that follows each a tag.
This is the HTML:
<pre><a href="../">../</a>
<a href="view_10496.html">view_10496.html</a> 06-Feb-2021 01:54 60K
<a href="view_10498.html">view_10498.html</a> 06-Feb-2021 01:54 53K
<a href="view_10499.html">view_10499.html</a> 06-Feb-2021 01:54 26K
<a href="view_10500.html">view_10500.html</a> 06-Feb-2021 01:54 15K
<a href="view_10501.html">view_10501.html</a> 06-Feb-2021 01:54 128K
My code can pick up the content of each a tag, but I also want to extract the text that comes after the a tag (the date, time and size). The counter (teller) is there so that I discard the first a tag, which is the "../" link.
Set alle_a_tags = ie.document.getElementsByTagName("a")
teller = 0
For Each a_tag In alle_a_tags
    ' Skip the first a tag (the "../" link)
    If teller = 0 Then
        GoTo Volgende_a_tag
    End If
    InnerHTML = a_tag.innerHTML
    InnerText = a_tag.innerText
    Href = a_tag.href
    Date = ...
Volgende_a_tag:
    teller = teller + 1
Next
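Based only on the HTML provided:
You can use an attribute selector with the starts-with operator (^=) to match the a elements whose href value begins with view_. These are the nodes that immediately precede the text you want, and the selector also excludes the "../" link, so no counter is needed. You then move to each element's nextSibling to get the desired text. Use Select Case on the sibling's nodeType to decide which property to read: innerText for an element node (1), nodeValue for a text node (3).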
Dim i As Long, nodes As Object, nextSibling As Object
' Select every element whose href starts with "view_"
Set nodes = ie.document.querySelectorAll("[href^='view_']")
For i = 0 To nodes.Length - 1
    ' The node directly after the a element holds the date, time and size
    Set nextSibling = nodes.Item(i).nextSibling
    'https://developer.mozilla.org/en-US/docs/Web/API/Node/nodeType
    Select Case nextSibling.NodeType
        Case 1 ' element node
            Debug.Print nextSibling.innerText
        Case 3 ' text node
            Debug.Print nextSibling.NodeValue
    End Select
Next
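If you then want the date, time and size as separate values, one option is to split the sibling's text on whitespace. The following is only a rough sketch: SplitListingText and DemoSplitListingText are hypothetical names that are not part of the code above, and it assumes the text always has the shape " 06-Feb-2021 01:54 60K" shown in the HTML provided.

```vba
' Hypothetical helper (an assumption, not part of the answer above):
' splits text such as " 06-Feb-2021 01:54 60K" into its fields.
Function SplitListingText(ByVal raw As String) As String()
    ' Turn line breaks and tabs into spaces, then trim both ends
    raw = Replace(raw, vbCr, " ")
    raw = Replace(raw, vbLf, " ")
    raw = Replace(raw, vbTab, " ")
    raw = Trim$(raw)
    ' Collapse runs of spaces so Split does not return empty items
    Do While InStr(raw, "  ") > 0
        raw = Replace(raw, "  ", " ")
    Loop
    SplitListingText = Split(raw, " ")
End Function

Sub DemoSplitListingText()
    Dim parts() As String
    ' Sample value copied from the HTML in the question
    parts = SplitListingText(" 06-Feb-2021 01:54 60K" & vbLf)
    ' Only read the fields if all three are present
    If UBound(parts) >= 2 Then
        Debug.Print parts(0) ' date field
        Debug.Print parts(1) ' time field
        Debug.Print parts(2) ' size field
    End If
End Sub
```

Inside the loop you could pass nextSibling.NodeValue to the helper instead of printing it. The UBound check is there so that a sibling with fewer than three fields is skipped rather than raising a subscript error.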