c# rtf

Cannot write to RTF file after replacing text in a string with UTF-8 characters


I have an RTF file in which I have to replace some text with language-specific characters (UTF-8). After the replacements I try to save the result to a new RTF file, but either the characters come out wrong (strange characters) or the file is displayed as raw RTF code with all the formatting markup visible. Here is my code:

var fs = new FileStream(@"F:\projects\projects\RtfEditor\Test.rtf", FileMode.Open, FileAccess.Read);
//reads the file in a byte[]
var sb = FileWorker.ReadToEnd(fs);
var enc = Encoding.GetEncoding(1250);
//var enc = Encoding.UTF8;
var sbs = enc.GetString(sb);
var sbsNew = sbs.Replace("#test/#", "ă î â șșțț");
// first writing approach
var fsw = new FileStream(@"F:\projects\projects\RtfEditor\diac.rtf", FileMode.Create, FileAccess.Write);
fsw.Write(enc.GetBytes(sbsNew), 0, enc.GetBytes(sbsNew).Length);
fsw.Flush();
fsw.Close();

With this approach, the resulting file is a valid RTF file, but the characters "șșțț" are shown as "????".

// second writing approach
using (StreamWriter sw = new StreamWriter(fsw, Encoding.UTF8))
{
    sw.Write(sbsNew);
    sw.Flush();
}

With this approach, the resulting file has the .rtf extension but is displayed as raw RTF code with all the formatting markup, although the special characters are saved correctly (șșțț appear correctly, no more ????).


Solution

  • An RTF file can directly contain only 7-bit ASCII characters. Everything else must be encoded as escape sequences (for example, \uN control words for Unicode characters). More detailed information can be found in, e.g., this Wikipedia article.
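
As a rough illustration of the escaping described above, here is a minimal sketch that turns the replacement text into 7-bit-safe RTF before it is inserted. The class and method names (RtfEscaper, Escape) are hypothetical and not part of any library. It assumes the default RTF \uc1 setting, where each \uN control word is followed by exactly one fallback character (here "?"), and it does not handle line breaks or other control characters.

```csharp
using System.Globalization;
using System.Text;

// Hypothetical helper, not part of .NET: escapes plain text for use inside RTF.
public static class RtfEscaper
{
    public static string Escape(string text)
    {
        var sb = new StringBuilder(text.Length * 2);
        foreach (char c in text)
        {
            if (c == '\\' || c == '{' || c == '}')
            {
                // These three characters are RTF syntax and need a backslash.
                sb.Append('\\').Append(c);
            }
            else if (c < 0x80)
            {
                // Plain 7-bit ASCII can be written as-is.
                sb.Append(c);
            }
            else
            {
                // \uN expects a signed 16-bit decimal value, so code units
                // above 32767 become negative. '?' is the fallback character
                // shown by readers that do not understand \u.
                short code = unchecked((short)c);
                sb.Append("\\u")
                  .Append(code.ToString(CultureInfo.InvariantCulture))
                  .Append('?');
            }
        }
        return sb.ToString();
    }
}
```

With the code from the question, this would be used as sbs.Replace("#test/#", RtfEscaper.Escape("ă î â șșțț")), which inserts \u259? \u238? \u226? \u537?\u537?\u539?\u539? into the document. Because the result is pure ASCII, it can then be written with the first approach unchanged; code page 1250 no longer matters for the inserted text, and no UTF-8 byte order mark ends up in front of the {\rtf header.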