VB: Pull text from specific HTML tag into Textbox


I am trying to write a VB.Net application that needs to pull the text from a specific tag:

<span data-reactid="85">172,890,000</span>

and then enter the text it contains (172,890,000) into a text box on the form.

In TextBox1, you enter the stock symbol you want to look up.

The data for "TTM - Total Revenue" will always be held within the:

<span data-reactid="85">172,890,000</span> tag, regardless of which stock you check.

RichTextBox1 holds the downloaded source code for the URL.

TextBox2 is where it pulls "TTM". I will probably change it to a label, as it's a constant value. I can't hard-code the number in a variable because it varies by company, i.e. by the value entered into TextBox1.

TextBox3 will show the value I really need: the 172,890,000 held in the

<span data-reactid="85">172,890,000</span> tag.

How can I search for that string within RichTextBox1 and pull out the characters that follow it? Would taking the next 7 characters after the end of the string work?

My code so far is:

Public Class Form1

    Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click

        Dim Source As String
        Dim ttm1 As String
        Dim ttmrev As String

        Source = New System.Net.WebClient().DownloadString("https://finance.yahoo.com/quote/" + TextBox1.Text + "/financials?p=" + TextBox1.Text)
        ' String literals need quotes; embedded double quotes are doubled in VB
        ttm1 = "<span data-reactid=""65"">ttm</span>"
        ttmrev = "<span data-reactid=""85""></span>"

        RichTextBox1.Text = Source


        ' Find returns the index of the match, or -1 when nothing is found
        If RichTextBox1.Find(ttm1) >= 0 Then

            TextBox2.Text = "ttm".ToUpper

        End If

    End Sub

End Class

Solution

  • As usual in coding, there are always many ways to accomplish the same end result. One way is to use a Regex.

    For example:

    Imports System.Diagnostics
    Imports System.Text.RegularExpressions

    Module Module1

        Sub Main()

            ' Replace this static string with the text you want to search,
            ' for example the downloaded page source.
            Dim str As String = "<span data-reactid=""85"">172,890,000</span>"

            ' Search for the match
            showMatch(str, "(\d+),?(\d+),?(\d+)")

        End Sub

        Sub showMatch(ByVal text As String, ByVal expr As String)

            ' Make sure your expression looks the way you want it
            Debug.WriteLine("The Expression: " & expr)

            ' Do the regex match
            Dim mc As MatchCollection = Regex.Matches(text, expr)

            ' Handle the result however you wish. For example, instead of a loop,
            ' if you expect only one result you could do something like:
            ' If mc.Count > 0 Then TextBox3.Text = mc(0).Value
            For Each m As Match In mc
                Debug.WriteLine(m.Value)
            Next

        End Sub

    End Module
    

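    The above regex matches any number with at least three digits and up to two comma separators. Because it is not tied to the span, it will also match other numbers of that shape elsewhere in the page source. For example, it matches: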

    172,890,000
    890,000
    000
    2,80,000
    
    etc.
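
    To pick out only the value you want, one option is to anchor the pattern to the span itself and capture what is inside it. The following is a minimal sketch rather than a definitive implementation: ExtractSpanText and SpanExtractor are hypothetical names, and it assumes the page source really contains a span with that exact data-reactid attribute, as described in the question.

    ```vbnet
    Imports System
    Imports System.Text.RegularExpressions

    Module SpanExtractor

        ' Hypothetical helper: returns the text inside
        ' <span ... data-reactid="{reactId}" ...>text</span>,
        ' or Nothing when no such span is found.
        Function ExtractSpanText(html As String, reactId As String) As String
            ' Regex.Escape guards against special characters in the id.
            ' The single capture group holds everything up to the next "<".
            Dim pattern As String =
                "<span[^>]*\bdata-reactid=""" & Regex.Escape(reactId) & """[^>]*>([^<]*)</span>"

            Dim m As Match = Regex.Match(html, pattern)
            If m.Success Then
                Return m.Groups(1).Value.Trim()
            End If
            Return Nothing
        End Function

        Sub Main()
            ' Sample input only; in the real program this would be the downloaded source.
            Dim sample As String =
                "<span data-reactid=""65"">ttm</span><span data-reactid=""85"">172,890,000</span>"

            Console.WriteLine(ExtractSpanText(sample, "85")) ' prints 172,890,000
        End Sub

    End Module
    ```

    In the button handler this could be used as something like TextBox3.Text = ExtractSpanText(Source, "85"), after checking that the result is not Nothing.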
    

    Another option might be to use a WebBrowser control, load the page into it, and read the InnerHtml of the element you need.
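
    The plain string search suggested in the question can also work, provided you read up to the closing tag instead of taking a fixed 7 characters (172,890,000 is 11 characters long, and the length will vary by company). Here is a rough sketch of that idea; TextBetween and MarkerSearch are hypothetical names, and it assumes the opening tag appears in the source exactly as written, with no extra attributes or whitespace.

    ```vbnet
    Imports System

    Module MarkerSearch

        ' Hypothetical helper: returns the text between openTag and the next
        ' closeTag, or Nothing when either marker is missing.
        Function TextBetween(source As String, openTag As String, closeTag As String) As String
            Dim start As Integer = source.IndexOf(openTag, StringComparison.Ordinal)
            If start < 0 Then Return Nothing

            ' Move past the opening tag so the result begins at the value itself.
            start += openTag.Length

            Dim finish As Integer = source.IndexOf(closeTag, start, StringComparison.Ordinal)
            If finish < 0 Then Return Nothing

            Return source.Substring(start, finish - start)
        End Function

        Sub Main()
            ' Sample input only; in the real program this would be RichTextBox1.Text.
            Dim source As String = "<div><span data-reactid=""85"">172,890,000</span></div>"

            Dim value As String = TextBetween(source, "<span data-reactid=""85"">", "</span>")
            Console.WriteLine(value) ' prints 172,890,000
        End Sub

    End Module
    ```

    This avoids regular expressions entirely, at the cost of breaking if the tag's attributes or spacing ever change.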