I want to retrieve the href attribute of the <h3>
tags of an html page, I am able to get the innerText, but I don't know how to access the href attribute, there are several <h3>
tags in the document, but for the time being I just need the first one. I will deal with the rest later...
This is the code I got so far
Sub Scrap()
Dim IE As New InternetExplorer
Dim sDD As String
Dim Doc As HTMLDocument
IE.Visible = True
IE.navigate "https://www.oneoiljobsearch.com/senior-reservoir-engineer-jobs/?page=1"
Do
DoEvents
Loop Until IE.readyState = READYSTATE_COMPLETE
Set Doc = IE.document
sDD = Trim(Doc.getElementsByTagName("h3")(0).innerText)
'sDD contains the string "Senior Reservoir Engineer"
End Sub
Below is a portion of the HTML document to extract data from:
<div class="front_job_details">
<h3>
<a href="/jobs/senior-reservoir-engineer-oslo-norway-7?cmp=js&from=job-search-form-2" target="_blank">
Senior Reservoir Engineer
</a>
</h3>
The text I need to retrieve is: "/jobs/senior-reservoir-engineer-oslo-norway-7?cmp=js&from=job-search-form-2"
Thanks in advance for your help.
Try,
dim hr as string
hr = Doc.getElementsByTagName("h3")(0).getElementsByTagName("a")(0).href
debug.print hr
The getElementsByTagName collection is zero-based but the .Length (the # of H3's, called Count in other methods) is one-based.
dim i as long
for i=0 to Doc.getElementsByTagName("h3").length - 1
debug.print Doc.getElementsByTagName("h3")(i).getElementsByTagName("a")(0).href
next i
This gets the first <A> tag from each H3. You could duplicate the method to get multiple A's from each H3.