javascript html wpf web getelementbyid

How to get the value of a specific HTML element in a given HTML document or webpage (by URL)?


I want to pass the URL of a webpage containing a <span id="spanID"> value </span> tag to a method such as setTextBoxText(string url, string id), written in the code-behind of a WPF application (MainWindow.xaml.cs), and set the Text of a specific TextBox control to the span's value without loading the webpage in a browser (for example, to track the price of a product on Amazon).

I would prefer to execute JavaScript code to get the value of the HTML elements and set the content of the WPF controls to the result of that JS function.

Something like this:

public partial class MainWindow : Window
{
    string url = "https://websiteaddress.com/rest";
    setTextBoxText(url, "spanID");

    static void setTextBoxText(string url, string id)
    {
        // code to get document by given url
        txtPrice.Text = getHtmlElementValue(id);
    }

    string getHtmlElementValue(string id)
    {
        // what code should be written here?
        // any combination of js and c#?
        // var result = document.getElementById(id).textContent;
        // return result;
    }
}

Solution

  • You can use HttpClient to download the HTML content of a URL and then work with the DOM in a JavaScript-like syntax by writing the response into an mshtml.HTMLDocument. This requires a reference to Microsoft.mshtml.dll:

    // Needed at the top of the file:
    using System.Net.Http;
    using System.Text;
    using System.Threading.Tasks;
    
    private mshtml.HTMLDocument HtmlDocument { get; set; }
    
    private async Task SetTextBoxTextAsync(string url, string id)
    {
      await UpdateHtmlDocumentAsync(url);
      txtPrice.Text = GetHtmlElementValueById(id);
    }
    
    public async Task UpdateHtmlDocumentAsync(string url)
    {
      using (HttpClient httpClient = new HttpClient())
      {
        byte[] response = await httpClient.GetByteArrayAsync(url);
        // Decode the whole buffer; this assumes the page is UTF-8 encoded.
        // The parser decodes HTML entities itself, so the markup is passed through unchanged.
        string htmlContent = Encoding.UTF8.GetString(response);
    
        // Parse the markup in memory without displaying it.
        this.HtmlDocument = new mshtml.HTMLDocument();
        var document = (mshtml.IHTMLDocument2)this.HtmlDocument;
        document.write(htmlContent);
        document.close();
      }
    }
    
    // Returns the element's text, or null if no element has the given id.
    public string GetHtmlElementValueById(string elementId) 
      => this.HtmlDocument.getElementById(elementId)?.innerText;
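
A minimal sketch of how the methods above might be called, assuming they are members of MainWindow and that the window's XAML declares a TextBox named txtPrice, as in the question. The URL and element id are the placeholders from the question. The Loaded handler and the error handling are illustrative assumptions, not part of the original answer, and the fragment only compiles inside a WPF project alongside the code above.

```csharp
using System;
using System.Net.Http;
using System.Windows;

public partial class MainWindow : Window
{
    public MainWindow()
    {
        InitializeComponent();
        // A constructor cannot await, so start the download once the window has loaded.
        Loaded += OnLoaded;
    }

    // async void is acceptable here because this is an event handler.
    private async void OnLoaded(object sender, RoutedEventArgs e)
    {
        try
        {
            // Placeholder URL and id taken from the question.
            await SetTextBoxTextAsync("https://websiteaddress.com/rest", "spanID");
        }
        catch (HttpRequestException ex)
        {
            // Network or HTTP failure: show the reason instead of crashing the UI.
            txtPrice.Text = "Request failed: " + ex.Message;
        }
    }
}
```

Because the await resumes on the UI thread, the TextBox can be updated directly after the download without a Dispatcher call. Note that a method call cannot appear directly in the class body as in the question's sketch; it has to run from a constructor, event handler, or another method.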