I am working in C# on .NET Core.
Which library or NuGet package can I use in C# to extract data from HTML with XPath?
I want something like:
extractedData = xpathLib.Extract(htmlContent, xpath)
I do not want to use a technique that launches a browser process (like a Selenium driver opening Chrome), since I have to extract data from about 10,000 web pages per day.
Regards. PS: I have seen that Microsoft provides an XPath library, but it only targets XML.
You can use HTML Agility Pack.
This NuGet package works with XPath, XDocument and LINQ, and it is easy to use. It parses the HTML itself, so no browser process is started.
Here is an example from the HTML Agility Pack site:
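// requires: using HtmlAgilityPack;
var url = "http://html-agility-pack.net/";
var web = new HtmlWeb();
var doc = web.Load(url); // downloads and parses the page
// SelectNodes returns null when no node matches the XPath
var value = doc.DocumentNode.SelectNodes("//td/input");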
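
If you already have the HTML as a string, as in the question, you can parse it directly instead of loading a URL. Below is a minimal sketch, assuming the HtmlAgilityPack NuGet package is installed. The class name XPathExtractor and the method Extract are hypothetical names chosen to mirror the call in the question; they are not part of the library. Returning the trimmed inner text of each match is also just one possible choice.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class XPathExtractor
{
    // Hypothetical helper mirroring Extract(htmlContent, xpath) from the question.
    public static List<string> Extract(string htmlContent, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(htmlContent); // parses in memory: no browser, no HTTP request

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null)
        {
            return new List<string>();
        }

        // Assumption: the caller wants the text content of each matched node.
        return nodes.Select(n => n.InnerText.Trim()).ToList();
    }
}
```

A call such as XPathExtractor.Extract(htmlContent, "//td") would then give you the text of every td element. If you need attribute values instead, something like n.GetAttributeValue("href", "") inside the Select should work.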