Tags: c#, asp.net, web-applications, parser-generator

How to parse HTML files and submit information programmatically


I am using ASP.NET 4 and C#.

I would like to know which classes and code could be useful for creating a web application that could:

01 - Connect to an HTML file on the web.
02 - Parse its content (text content).
03 - Find specific content in a page (for example, by looking for specific keywords).

Also how to implement:

04 - How to submit information programmatically to an HTML page (filling in forms).

I am interested in understanding the classes, general practice, and code for accomplishing this task.

If you have any ideas, please let me know. Thanks guys once again for your support! :-)


Solution

  • I'm not sure whether you want everything you mention to run server-side, but assuming that is the case:

    01 - Connect to an HTML file on the web.

    Check out the WebClient class, and the HttpWebRequest class for more advanced scenarios.
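
    As a minimal sketch of the simple case, the following uses WebClient.DownloadString to fetch a page as a string. The class name PageFetcher and the method name DownloadHtml are made up for illustration, and the UTF-8 setting is an assumption about the target page.

```csharp
using System;
using System.Net;
using System.Text;

// Hypothetical helper; the names are illustrative only.
public static class PageFetcher
{
    // Downloads the resource at the given URL and returns its content as a string.
    public static string DownloadHtml(string url)
    {
        using (var client = new WebClient())
        {
            // Assumption: the page is served as UTF-8. Set another Encoding if it is not.
            client.Encoding = Encoding.UTF8;
            return client.DownloadString(url);
        }
    }
}
```

    Calling PageFetcher.DownloadHtml("http://example.com/") would return the page markup, or throw a WebException if the request fails. HttpWebRequest is the better fit when you need control over headers, cookies, or timeouts.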

    02 - Parse its content (text content).
    03 - Find out specific content in a page (for example looking for specific keywords).

    You might want to look at the Html Agility Pack, or if Bobince doesn't notice, regular expressions.
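
    A rough sketch of steps 02 and 03 with the Html Agility Pack (a third-party library, installed separately, for example from NuGet) might look like this. The class and method names (KeywordFinder, ExtractText, ContainsKeyword) are hypothetical, and dropping script and style elements is my own assumption about what "text content" should mean here.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper; the names are illustrative only.
public static class KeywordFinder
{
    // Returns the visible text of an HTML document, without script or style content.
    public static string ExtractText(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null when nothing matches, so guard before iterating.
        HtmlNodeCollection unwanted = doc.DocumentNode.SelectNodes("//script|//style");
        if (unwanted != null)
        {
            // Copy the matches first so removal never touches the list being enumerated.
            foreach (HtmlNode node in new List<HtmlNode>(unwanted))
            {
                node.Remove();
            }
        }

        // InnerText leaves entities such as &amp; encoded, so decode them.
        return HtmlEntity.DeEntitize(doc.DocumentNode.InnerText);
    }

    // Case-insensitive search for a keyword in the extracted text.
    public static bool ContainsKeyword(string html, string keyword)
    {
        return ExtractText(html).IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

    Searching the extracted text rather than the raw markup avoids false matches on tag names, attribute values, or script code.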

    04 - How to submit information programmatically in HTML page (filling in forms).

    Typically, this will require sending an HTTP POST request, which can also be accomplished with the HttpWebRequest class.
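
    A minimal sketch of such a POST, assuming the target form accepts the usual application/x-www-form-urlencoded body, could look like the following. FormPoster, BuildFormBody, and Post are made-up names, and the field names you pass in must match the name attributes of the form's inputs.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper; the names are illustrative only.
public static class FormPoster
{
    // Encodes fields as name=value pairs joined by '&', escaping reserved characters.
    // Note: Uri.EscapeDataString writes a space as %20 rather than '+'.
    public static string BuildFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        var sb = new StringBuilder();
        foreach (KeyValuePair<string, string> field in fields)
        {
            if (sb.Length > 0) sb.Append('&');
            sb.Append(Uri.EscapeDataString(field.Key));
            sb.Append('=');
            sb.Append(Uri.EscapeDataString(field.Value));
        }
        return sb.ToString();
    }

    // Sends the fields as an HTTP POST and returns the response body as a string.
    public static string Post(string url, IEnumerable<KeyValuePair<string, string>> fields)
    {
        byte[] body = Encoding.UTF8.GetBytes(BuildFormBody(fields));

        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.ContentLength = body.Length;

        // Write the encoded form data into the request body.
        using (Stream stream = request.GetRequestStream())
        {
            stream.Write(body, 0, body.Length);
        }

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

    The URL to post to is the form's action attribute, not necessarily the URL of the page that displays the form. Many forms also carry hidden inputs (ASP.NET Web Forms pages use __VIEWSTATE, for example) that the server expects to receive back, so you may need to fetch and parse the page first and include those values.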