Hi, I have a resume in HTML format. I am reading the file using StreamReader and removing the tags using the method below:
    using (StreamReader sr = new StreamReader("\\Myfile.html"))
    {
        // Read the whole file, then strip anything that looks like a tag.
        string line = sr.ReadToEnd();
        string jj = Regex.Replace(line, "<.*?>", String.Empty);
    }
It works well. However, my requirement is to get only the content inside the body tag: without the body tag itself and without any of the tags inside it.
Don't use regex for HTML/XML parsing; use an HTML/XML parser instead. These questions explain well why regex is a poor fit:
RegEx match open tags except XHTML self-contained tags
Can you provide some examples of why it is hard to parse XML and HTML with a regex?
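To make the point concrete, here is a minimal sketch that runs the pattern from the question against two small inputs. The class name `RegexStripDemo`, the method name `StripTags`, and the sample strings are made up for illustration; only the pattern `<.*?>` comes from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexStripDemo
{
    // Same pattern as in the question: lazily match from '<' to the next '>'.
    public static string StripTags(string html)
    {
        return Regex.Replace(html, "<.*?>", string.Empty);
    }

    public static void Main()
    {
        // A '>' inside an attribute value ends the match too early,
        // so part of the tag leaks into the output.
        Console.WriteLine(StripTags("<a title=\"a > b\">link</a>"));
        // prints:  b">link   (with a leading space)

        // By default '.' does not match a newline, so a tag that spans
        // two lines is not matched at all and stays in the output.
        Console.WriteLine(StripTags("<p\nclass=\"x\">text</p>"));
        // prints: <p
        //         class="x">text
    }
}
```

Both inputs are valid HTML, yet the result still contains markup. Passing `RegexOptions.Singleline` would address the second case but not the first, which is why a parser is the safer choice.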
You can load the file into an HTML document using the HTML Agility Pack.
Here is a small example of how to do it:
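    // htmlFile is assumed to hold the path of the HTML file.
    public string GetBodyText()
    {
        HtmlDocument doc = new HtmlDocument();
        doc.Load(htmlFile);

        // "//body" searches the whole document; SelectSingleNode returns null if nothing matches.
        HtmlNode body = doc.DocumentNode.SelectSingleNode("//body");
        if (body == null)
        {
            return string.Empty;
        }

        // InnerText drops every tag; DeEntitize turns entities such as &amp; back into characters.
        return HtmlEntity.DeEntitize(body.InnerText);
    }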