I am parsing an HTML table into a text file; my code sample is below. In cols6, the 6th <td></td>, the inner text is something like 70,430. I can't work out how to drop the comma when writing the inner text to the text file: I would like it to write 70430 instead of 70,430. What should I do to cols6[j].InnerText to get rid of the , in the numbers? Any help would be much appreciated. Thank you! :)
// Load HTML
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.Load(fileName);
// Get all tables in the document
HtmlNodeCollection tables = doc.DocumentNode.SelectNodes("//table");
using (FileStream fs = new FileStream(@"..\..\bin\Debug\Pages\" + "Director.txt", FileMode.Append))
using (StreamWriter sw = new StreamWriter(fs))
{
    // Iterate all rows in the relevant table
    HtmlNodeCollection rows = tables[2].SelectNodes(".//tr[position() >2]");
    for (int i = 0; i < rows.Count; ++i)
    {
        // Iterate all columns in this row
        HtmlNodeCollection cols = rows[i].SelectNodes(".//td[1]");
        HtmlNodeCollection cols2 = rows[i].SelectNodes(".//td[2]");
        HtmlNodeCollection cols3 = rows[i].SelectNodes(".//td[3]");
        HtmlNodeCollection cols4 = rows[i].SelectNodes(".//td[4]");
        HtmlNodeCollection cols5 = rows[i].SelectNodes(".//td[5]");
        HtmlNodeCollection cols6 = rows[i].SelectNodes(".//td[6]");
        HtmlNodeCollection cols7 = rows[i].SelectNodes(".//td[7]");
        for (int j = 0; j < cols.Count; ++j)
            // Get the value of the column and print it
            sw.WriteLine(cols[j].InnerText + "," + cols2[j].InnerText + "," + cols3[j].InnerText + "," +
                cols4[j].InnerText + "," + cols5[j].InnerText + "," + cols6[j].InnerText + "," + cols7[j].InnerText + ",822");
    }
    sw.Flush();
    sw.Close();
    fs.Close();
}
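You can use Replace() to remove the comma. Strings are immutable, so Replace() returns a new string that you have to assign or use directly: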
cols6[j].InnerText = cols6[j].InnerText.Replace(",", "");
Alternatively, call Replace() inline in the WriteLine() call, so nothing is assigned back to the node:
sw.WriteLine(cols[j].InnerText + "," + cols2[j].InnerText + "," + cols3[j].InnerText + "," +
cols4[j].InnerText + "," + cols5[j].InnerText + "," + cols6[j].InnerText.Replace(",", "") + "," + cols7[j].InnerText + ",822");
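If the cell might also carry surrounding whitespace, the stripping could be pulled into a small helper. The following is only a sketch under assumptions: the class name CellText and the method StripThousands are hypothetical (not part of HtmlAgilityPack), and it assumes the cell uses "," as the thousands separator, as in 70,430. It tries to parse the text as a whole number first and falls back to a plain Replace() when that fails.

```csharp
using System;
using System.Globalization;

static class CellText
{
    // Hypothetical helper: returns the cell text without thousands separators.
    public static string StripThousands(string innerText)
    {
        // Cell text often carries leading/trailing whitespace or newlines.
        string trimmed = innerText.Trim();

        // AllowThousands accepts group separators such as "70,430";
        // InvariantCulture pins the group separator to ",".
        if (long.TryParse(trimmed, NumberStyles.AllowThousands,
                          CultureInfo.InvariantCulture, out long value))
        {
            return value.ToString(CultureInfo.InvariantCulture);
        }

        // Not a plain whole number (decimals, text, empty): just drop commas.
        return trimmed.Replace(",", "");
    }
}
```

In the WriteLine() call it would then be used as CellText.StripThousands(cols6[j].InnerText) in place of cols6[j].InnerText. Note that the fallback removes every comma, so it should only be applied to columns that are expected to be numeric.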