I am using VBA for web scraping. Below are the HTML structure and my VBA code.
When I run it, I get this text: ETA : 2020-08-26 (Reference only, the date will be updated according to shipments).
But I want to scrape only the date from it: 2020-08-26
<div style="font-size: 14px;">
<span class="label" style="font-weight: bolder; font-size: 13px;">ETA : </span>
<br>
2020-08-26
<span style="color: red; font-size: 12px;">(Reference only, the date will be updated according to
shipments).</span>
</div>
VBA code:
Dim ie As New InternetExplorer
Dim doc As HTMLDocument
Dim elems As Object
ie.navigate "http://127.0.0.1/wordpress/sample-page/"
' Wait until the page has finished loading
Do
    DoEvents
Loop Until ie.readyState = READYSTATE_COMPLETE
Set doc = ie.document
Set elems = doc.getElementsByTagName("div")
' Index 33 is the div that currently returns the full ETA text
MsgBox elems(33).innerText
This code finds the first date of the form ####-##-## in a string:
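Dim s As String, aa As String, fd As String
Dim a() As String
Dim i As Long
Cells.Clear
s = "ETA : 2020-08-26 (Reference only, the date will be updated according to shipments)."
' Build a masked copy of s in which every digit is replaced by "#"
ReDim a(1 To Len(s))
For i = 1 To Len(s)
    a(i) = IIf(Mid(s, i, 1) Like "#", "#", Mid(s, i, 1))
Next i
fd = "####-##-##"   ' what a date looks like after masking
Cells(1, 1) = s
aa = Join(a, "")
Cells(2, 1) = aa
' The template's position in the masked string is the date's position in s
If InStr(aa, fd) > 0 Then
    Cells(3, 1) = Mid(s, InStr(aa, fd), Len(fd))
    Cells(3, 1).NumberFormat = "yyyy-mm-dd"
End If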
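First it copies the string into an array one character at a time, replacing every digit with a #. Then it uses InStr to find the pattern template fd in the masked string. Because the masked string has the same length as the original, the position returned by InStr is also where the actual date starts in s, so Mid can extract it from there.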
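The same idea can be wrapped in a function so it can be applied directly to the scraped text. The following is only a sketch of a variation, not the code above: instead of masking the string and calling InStr, it slides a 10-character window over the string and tests each window with Like, which already treats # as "any single digit". The names ExtractFirstDate and DemoExtractFirstDate are made up for illustration.

```vba
' Returns the first substring of s that looks like ####-##-##,
' or an empty string if there is none.
' ExtractFirstDate is a hypothetical helper name.
Function ExtractFirstDate(ByVal s As String) As String
    Const fd As String = "####-##-##"
    Dim i As Long
    For i = 1 To Len(s) - Len(fd) + 1
        ' Like matches each # against one digit; the hyphens are literal
        If Mid$(s, i, Len(fd)) Like fd Then
            ExtractFirstDate = Mid$(s, i, Len(fd))
            Exit Function
        End If
    Next i
End Function

Sub DemoExtractFirstDate()
    Dim s As String
    s = "ETA : 2020-08-26 (Reference only, the date will be updated according to shipments)."
    Debug.Print ExtractFirstDate(s)   ' prints 2020-08-26
End Sub
```

With the scraper from the question, this would presumably be called as MsgBox ExtractFirstDate(elems(33).innerText). Note that the function only checks the digit pattern; it does not verify that the result is a valid calendar date.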