The app is developed in .NET and reads an RTF document template containing placeholders that need to be replaced with text stored in a SQL Server database. The app then saves the RTF document with the substituted text. However, French characters read from the database, such as é, are displayed as Ã© in the RTF document.
The process is: read the template, replace the placeholders, and save the result.
The key bits of the code, I think, are:
Read from RTF doc:
StringBuilder buffer;
using (StreamReader input = new StreamReader(pathToTemplate))
{
    buffer = new StringBuilder(input.ReadToEnd());
}
Replace placeholder text with text from database:
buffer.Replace("$$placeholder$$", strFrenchCharsFromDb);
Save the edits as a new RTF doc:
byte[] fileBytes = System.Text.Encoding.UTF8.GetBytes(buffer.ToString());
File.WriteAllBytes(pathToNewRtfDoc, fileBytes);
When I debug buffer during "Save", the é character is present.
When I open the RTF after File.WriteAllBytes, it contains Ã© instead.
I have tried specifying the encoding when creating the StreamReader, but the result was the same, i.e.:
using (StreamReader input = new StreamReader(pathToTemplate, Encoding.UTF8))
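A likely explanation, assuming the template declares a single-byte ANSI code page in its header (for example \ansicpg1252, which the post does not show): the save step writes é as the two UTF-8 bytes C3 A9, and the RTF reader then decodes each byte as a separate character. The sketch below reproduces that effect outside RTF. The class and method names are hypothetical, and ISO-8859-1 stands in for the reader's code page because it is built into every .NET runtime.

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    // Hypothetical helper: encodes the text as UTF-8, then decodes the same
    // bytes as ISO-8859-1, mimicking a reader that assumes one byte per character.
    public static string Utf8ReadAsLatin1(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return Encoding.GetEncoding("iso-8859-1").GetString(utf8Bytes);
    }

    public static void Main()
    {
        // "\u00E9" is é; the result is the two characters U+00C3 U+00A9 ("Ã©").
        string garbled = Utf8ReadAsLatin1("\u00E9");
        Console.OutputEncoding = Encoding.UTF8;
        Console.WriteLine(garbled);
    }
}
```

If that assumption holds, changing the StreamReader encoding cannot help, because the damage happens when the file is interpreted, not when the template is read.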
Apply the following method to the strFrenchCharsFromDb string before calling Replace():
buffer.Replace("$$placeholder$$", ConvertNonAsciiToEscaped(strFrenchCharsFromDb));
The ConvertNonAsciiToEscaped() method implementation:
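/// <param name="rtf">An RTF string that can contain non-ASCII characters and should be converted to the correct format before loading into the RichTextBox control.</param>
/// <returns>The source RTF string with non-ASCII characters converted to escaped sequences.</returns>
public string ConvertNonAsciiToEscaped(string rtf)
{
    var sb = new StringBuilder();
    foreach (var c in rtf)
    {
        if (c <= 0x7f)
        {
            // Plain ASCII is copied through unchanged.
            sb.Append(c);
        }
        else
        {
            // RTF's \uN takes a signed 16-bit decimal, so code units above 0x7FFF
            // are written as negative numbers. The trailing "?" is the fallback
            // character for readers that do not understand \u.
            sb.Append("\\u" + unchecked((short)c) + "?");
        }
    }
    return sb.ToString();
}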
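As a quick check of what the escaping produces, here is a minimal, self-contained sketch. The class name and the EscapeNonAscii method are hypothetical stand-ins for the method above, written as a static helper so the example runs on its own. It assumes the input is a .NET string (UTF-16 code units) and writes code units above 0x7FFF as negative numbers, since \uN takes a signed 16-bit value.

```csharp
using System;
using System.Text;

public static class RtfEscapeDemo
{
    // Hypothetical standalone version of the escaping routine.
    public static string EscapeNonAscii(string text)
    {
        var sb = new StringBuilder();
        foreach (char c in text)
        {
            if (c <= 0x7f)
            {
                sb.Append(c); // ASCII passes through unchanged
            }
            else
            {
                // \uN expects a signed 16-bit decimal followed by a fallback character.
                sb.Append("\\u").Append(unchecked((short)c)).Append('?');
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // "Caf\u00E9" is "Café"; é is U+00E9, which is 233 in decimal.
        string escaped = EscapeNonAscii("Caf\u00E9");
        Console.WriteLine(escaped); // Caf\u233?

        // The escaped text is pure ASCII, so UTF-8 and ASCII produce identical bytes.
        byte[] utf8 = Encoding.UTF8.GetBytes(escaped);
        byte[] ascii = Encoding.ASCII.GetBytes(escaped);
        Console.WriteLine(utf8.Length == ascii.Length);
    }
}
```

Because the escaped string contains only ASCII characters, the encoding chosen in the save step no longer affects how the accented characters appear when the document is opened.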