I want to get the title of a web page. I'm setting client.Encoding and it almost works, but something is still wrong.
It returns "Budapeşte'de gezilecek yerler | Skyscanner Haberler", except that the apostrophe comes back as an HTML character reference (an escape code) instead of the actual character.
The Turkish character "ş" is OK.
// Requires: using System; using System.Linq; using System.Net; using System.Text;
string baslikCek()
{
    Uri url = new Uri("https://www.skyscanner.com.tr/haberler/budapestede-gezilecek-yerler");
    using (WebClient client = new WebClient())
    {
        // Decode the downloaded bytes as UTF-8.
        client.Encoding = Encoding.UTF8;
        string html = client.DownloadString(url);
        HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
        doc.LoadHtml(html);
        // Inner text of the first <title> element (null if there is none).
        string title = (from x in doc.DocumentNode.Descendants()
                        where x.Name.ToLower() == "title"
                        select x.InnerText).FirstOrDefault();
        return title;
    }
}
The example shown in your question is not quite right: the character reference for the apostrophe is missing its trailing ;.
The server does send it correctly, though, so you can decode it like this:
return System.Net.WebUtility.HtmlDecode(title);
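This is not the same as Encoding.UTF8, which is the binary encoding of the string data, that is, how the downloaded bytes are turned into characters. HtmlDecode works one level above that: it replaces HTML character references in the already-decoded text with the characters they stand for.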
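To make the difference between the two steps concrete, here is a minimal sketch. It assumes the title arrives with the apostrophe written as `&#39;` (the exact reference the server uses may differ), and `EntityDemo` and `DecodeTitle` are hypothetical names used only for illustration.

```csharp
using System;
using System.Net;
using System.Text;

public static class EntityDemo
{
    // Hypothetical helper: replaces HTML character references with the
    // characters they stand for.
    public static string DecodeTitle(string rawTitle)
    {
        return WebUtility.HtmlDecode(rawTitle);
    }

    public static void Main()
    {
        // Assumed sample input; the real page may use a different reference.
        string sample = "Budapeşte&#39;de gezilecek yerler";

        // Byte-level step: UTF-8 bytes become a .NET string.
        // This is what restores "ş", but it leaves "&#39;" untouched.
        byte[] bytes = Encoding.UTF8.GetBytes(sample);
        string raw = Encoding.UTF8.GetString(bytes);

        // Text-level step: the character reference becomes an apostrophe.
        Console.WriteLine(DecodeTitle(raw));
    }
}
```

In the method from the question, this would amount to passing the extracted title through `WebUtility.HtmlDecode` before returning it.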