I have a folder containing an XML file like this:
<?xml version="1.0" encoding="UTF-8"?>
<name>Santa Bárbara</name>
<name>Santa Bárbara</name>
<!-- RUID: [UmFuZG9tSVYkc2RlIyh9YcxtmfhRwqry58sgWYNIgEV1AjdsVswrKUorBoUlR6ylFgiaj5XJ0w0DP0lL/htWqOKtE33w1EhBbLABKokIfEo=] -->
The file looks well formed and is encoded in UTF-8; it contains Russian terms and characters such as "á" in "Santa Bárbara".
I need to read this file and create a record in a MySQL DB (through C#), but I'm facing encoding problems.
PS: the DB table has a few columns (to store city id, country and city translations), all text fields, utf8_general_ci.
I'm using the following code to read the files in a folder (just one file in this case):
foreach (string file in Directory.EnumerateFiles(@"C:\xml_folder\" + sub_folder, "*.xml")) {
    string response = File.ReadAllText(file, Encoding.GetEncoding("Windows-1252"));
    var document = XDocument.Parse(response);
    foreach (var child in document.Root.Elements("result")) {
        //... code here
        byte[] bytes;
        String name_it = "";
        String name_en = "";
        String name_es = "";
        String name_fr = "";
        String name_de = "";
        String name_ru = "";
        foreach (var translationsChild in child.Elements("translations")) {
            switch (translationsChild.Element("language").Value) {
                case "it":
                    bytes = Encoding.Default.GetBytes(translationsChild.Element("name").Value);
                    name_it = Encoding.UTF8.GetString(bytes);
                    break;
                case "en-gb":
                    bytes = Encoding.Default.GetBytes(translationsChild.Element("name").Value);
                    name_en = Encoding.UTF8.GetString(bytes);
                    break;
                case "es":
                    bytes = Encoding.Default.GetBytes(translationsChild.Element("name").Value);
                    name_es = Encoding.UTF8.GetString(bytes);
                    break;
                case "fr":
                    bytes = Encoding.Default.GetBytes(translationsChild.Element("name").Value);
                    name_fr = Encoding.UTF8.GetString(bytes);
                    break;
                case "de":
                    bytes = Encoding.Default.GetBytes(translationsChild.Element("name").Value);
                    name_de = Encoding.UTF8.GetString(bytes);
                    break;
                case "ru":
                    bytes = Encoding.Default.GetBytes(translationsChild.Element("name").Value);
                    name_ru = Encoding.UTF8.GetString(bytes);
                    break;
            }
        }
    }
}
In short, I read the file, then parse it as XML to walk through all the children and save them into the DB.
The problem seems related to the encoding I use when reading the string from the file. I tried reading it as Windows-1252:
string response = File.ReadAllText(file, Encoding.GetEncoding("Windows-1252"));
I also tried reading it as UTF-8:
string response = File.ReadAllText(file, System.Text.Encoding.UTF8);
but every time I get this (in the debug console and in the DB):
Santa Bárbara -> Santa B?rbara
Санта-Барбара -> ?????-??????
It looks like a problem with the way File.ReadAllText(...) works: the encoding argument seems to have no effect at all.
PS: to store the data in the DB I use a DML statement like this:
cmd.CommandText = "INSERT INTO cities (city_id,country,name,nr_hotels,name_it,name_en,name_es,name_fr,name_de,name_ru,last_modified_date) VALUES(@city_id,@country,@name,@nr_hotels,@name_it,@name_en,@name_es,@name_fr,@name_de,@name_ru,@last_modified_date) on duplicate key update city_id=@city_id,country=@country,name=@name,nr_hotels=@nr_hotels,name_it=@name_it,name_en=@name_en,name_es=@name_es,name_fr=@name_fr,name_de=@name_de,name_ru=@name_ru,last_modified_date=@last_modified_date";
Please, can you help me?
thanks in advance
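I don't see any sense in converting to a byte array and back: once the file has been read with the right encoding, XDocument already gives you ordinary .NET strings, so no further decoding is needed. This works properly for me: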
string response = File.ReadAllText(file, Encoding.UTF8);
var document = XDocument.Parse(response);
foreach (var child in document.Root.Elements("result"))
{
    //... code here
    String name_en = "";
    String name_ru = "";
    foreach (var translationsChild in child.Elements("translations"))
    {
        var name = translationsChild.Element("name").Value;
        switch (translationsChild.Element("language").Value)
        {
            case "en-gb":
                name_en = name;
                break;
            case "ru":
                name_ru = name;
                break;
        }
    }
}
Santa Bárbara
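To illustrate why the byte-array round trip is the likely culprit, here is a minimal sketch. It assumes Encoding.Default is a single-byte Western code page; ISO-8859-1 is used as a stand-in because it is available on every .NET runtime without extra setup. The class name, the temp file name and the simplified XML layout are made up for the demo. It also reads the file with XDocument.Load, which lets the XML parser pick the encoding from the XML declaration instead of passing one to File.ReadAllText.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Linq;

public static class EncodingDemo
{
    // Assumption: stand-in for Encoding.Default on a Western European Windows machine.
    private static readonly Encoding Ansi = Encoding.GetEncoding("ISO-8859-1");

    // Reproduces the round trip from the question:
    // encode with a single-byte code page, then decode those bytes as UTF-8.
    public static string RoundTrip(string value)
    {
        byte[] bytes = Ansi.GetBytes(value);   // characters outside the code page become '?'
        return Encoding.UTF8.GetString(bytes); // lone high bytes such as 0xE1 are not valid UTF-8
    }

    // Loads the file through the XML parser, which honours the
    // encoding named in the XML declaration (UTF-8 here).
    public static string ReadName(string path)
    {
        XDocument document = XDocument.Load(path);
        return document.Root.Element("name").Value;
    }

    public static void Main()
    {
        // Hypothetical sample file, written as UTF-8 without a BOM.
        string path = Path.Combine(Path.GetTempPath(), "encoding_demo.xml");
        string xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
                     "<result><name>Санта-Барбара</name></result>";
        File.WriteAllText(path, xml, new UTF8Encoding(false));

        string name = ReadName(path);

        // A console using a legacy code page may also print '?' for these
        // characters, so switch the output to UTF-8 before printing.
        Console.OutputEncoding = Encoding.UTF8;
        Console.WriteLine(name);            // the string as parsed from the file
        Console.WriteLine(RoundTrip(name)); // the same string after the lossy round trip

        File.Delete(path);
    }
}
```

If the first line prints correctly and the second one does not, the data is lost in the GetBytes/GetString step rather than in File.ReadAllText. If the values still arrive as "?" in MySQL after that step is removed, the character set of the connection is the next thing I would check.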