
Is there a way to force the UWP RichEditBox to use only UTF encoding when the user types?


I am trying to convert the contents of a UWP RichEditBox to HTML.

For that purpose, I've tried using the RtfPipe library (https://github.com/erdomke/RtfPipe). This library appears to have a problem on UWP because not all encodings are defined on that target framework. (This is the error you get, if you are interested: Encoding.GetEncoding can't work in UWP app, but the accepted answer there does not seem to be the best option on all platforms. I haven't even managed to make the suggested fix compile, so it might not be valid anymore.)

To avoid this, I am wondering whether there is a way to force the control to always use one of the UWP-defined UTF variants to encode the data when the user types text. Right now, when I type into it, I get things like this:

{\rtf1\fbidis\ansi\ansicpg1253\deff0\nouicompat\deflang1032{
....
\pard\tx720\cf1\f0\fs23\lang1033

...that make the library throw exceptions.

I guess, if I manage to make it not use ASCII code pages, things will be great. After taking a look at the control's properties, though, I do not see anything I could use. Is there any way to achieve this?


Solution

  • This is the error you get, if you are interested: Encoding.GetEncoding can't work in UWP app

    As you described, an inner error is thrown when this package is used in a UWP app. In my testing, it is System.ArgumentException: 'Windows-1252' is not a supported encoding name, thrown by the line public static readonly Encoding AnsiEncoding = Encoding.GetEncoding("Windows-1252"); in RtfSpec.cs during UpdateEncoding.

    Judging from the error details, Windows-1252 may not be supported in UWP (see also this similar thread). You could use UTF instead, as you want. For example, if you change the library as follows, it will work (testing demo here):

    public static readonly Encoding AnsiEncoding = Encoding.UTF8;
    

    I haven't even managed to make the suggested fix compile, so it might not be valid anymore

    The Encoding.RegisterProvider method should work, but it is only supported on UWP and .NET Framework 4.6; it is not supported in a Portable Class Library. The RtfPipe library you mentioned is a Portable Class Library, so it cannot use Encoding.RegisterProvider. The Encoding.GetEncoding method is supported in Portable Class Libraries; for details, check the version information of the two methods.
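For illustration, here is a minimal sketch of what the registration could look like on the app side rather than inside the library. It assumes the app project references the System.Text.Encoding.CodePages package (the source does not say which setup was tried), and EncodingSetup is a hypothetical helper name. Whether this alone is enough for the GetEncoding call inside RtfPipe has not been verified here.

```csharp
using System;
using System.Text;

// EncodingSetup is a hypothetical helper name, not part of RtfPipe or UWP.
public static class EncodingSetup
{
    // Registers the code-page provider and returns the encoding for the
    // given Windows code page, e.g. 1252 or 1253.
    public static Encoding RegisterAndGet(int codePage)
    {
        // Assumption: the System.Text.Encoding.CodePages package is referenced,
        // which is where CodePagesEncodingProvider comes from.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(codePage);
    }
}
```

In a UWP app, a call like EncodingSetup.RegisterAndGet(1252) would typically be made once at startup, before any RTF is handed to the library.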

    I guess, if I manage to make it not use ASCII code pages

    RTF itself uses the ANSI, PC-8, Macintosh, or IBM PC character set to control the representation and formatting of a document, so you may not be able to change that. Consider updating the library to resolve the issue for UWP.
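The code page the control picked is declared in the RTF header by the \ansicpg control word (1253 in the sample from the question). As a rough sketch, it can be read out before the text is passed to a converter, for example to check whether the encoding is available. RtfHeader and GetAnsiCodePage are hypothetical names used only for this example, and the regular expression is a simplification rather than a full RTF parser.

```csharp
using System;
using System.Text.RegularExpressions;

// RtfHeader is a hypothetical helper, not part of RtfPipe or the UWP API.
public static class RtfHeader
{
    // Returns the code page declared by the first \ansicpgN control word,
    // or null if the text does not contain one.
    public static int? GetAnsiCodePage(string rtf)
    {
        Match m = Regex.Match(rtf, @"\\ansicpg(\d+)");
        return m.Success ? int.Parse(m.Groups[1].Value) : (int?)null;
    }
}
```

For the header shown in the question, GetAnsiCodePage would return 1253, the Greek Windows code page.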