I have a byte array saved from a WPF rich text control, and I need to extract the text from it. The following code works:
    FlowDocument document = new FlowDocument();
    TextRange txtRange = null;
    using (MemoryStream stream = new MemoryStream(data))
    {
        txtRange = new TextRange(document.ContentStart, document.ContentEnd);
        txtRange.Load(stream, DataFormats.XamlPackage);
    }
The problem starts when an image is embedded in the RTF. I would still like to extract the text, but the code above fails with a XamlParseException on the Load method.
I also tried the following method:
    using (RichTextBox rtb = new RichTextBox())
    {
        rtb.Rtf = System.Text.Encoding.Default.GetString(data);
        // use rtb.Text
    }
but setting rtb.Rtf fails with an ArgumentException. The reason is probably explained here, since GetString indeed does not return the expected RTF format but mixed text and binary data with mentions of XAML (the same format is returned for text-only content, which the previous method extracted successfully). I cannot upgrade the framework.
I don't mind traversing the FlowDocument tree to extract the text, as long as I can find a way to load the document successfully.
Is there another way to read the RTF?
Apparently, when an image is included in the RTF, the code works if it runs on an STA thread, e.g.:
    Thread t = new Thread(() => Foo(data));
    // The apartment state must be set before the thread is started.
    t.SetApartmentState(ApartmentState.STA);
    t.Start();
    t.Join();

    void Foo(byte[] data)
    {
        FlowDocument document = new FlowDocument();
        TextRange txtRange = null;
        using (MemoryStream stream = new MemoryStream(data))
        {
            txtRange = new TextRange(document.ContentStart, document.ContentEnd);
            txtRange.Load(stream, DataFormats.XamlPackage);
        }
    }
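The STA snippet loads the document but does not hand the text back to the caller. Below is a minimal sketch of how that could be wrapped, assuming the bytes really are in XamlPackage format. The class name XamlPackageText and the method name Extract are hypothetical names chosen for illustration. Exceptions are captured on the worker thread and rethrown on the calling thread, because an unhandled exception on the worker would otherwise terminate the process.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Windows;
using System.Windows.Documents;

// Hypothetical helper, not part of WPF: the names are made up for this sketch.
static class XamlPackageText
{
    // Loads XamlPackage bytes on a dedicated STA thread and returns the plain text.
    public static string Extract(byte[] data)
    {
        string text = null;
        Exception error = null;

        Thread worker = new Thread(() =>
        {
            try
            {
                FlowDocument document = new FlowDocument();
                TextRange range = new TextRange(document.ContentStart, document.ContentEnd);
                using (MemoryStream stream = new MemoryStream(data))
                {
                    range.Load(stream, DataFormats.XamlPackage);
                }
                // Re-read the whole document so the text covers everything that was loaded.
                text = new TextRange(document.ContentStart, document.ContentEnd).Text;
            }
            catch (Exception ex)
            {
                // Keep the failure so it can be reported on the calling thread.
                error = ex;
            }
        });

        // The apartment state must be set before the thread is started.
        worker.SetApartmentState(ApartmentState.STA);
        worker.Start();
        worker.Join();

        if (error != null)
        {
            throw new InvalidOperationException("Loading the XamlPackage failed.", error);
        }
        return text;
    }
}
```

A caller on any thread could then write `string text = XamlPackageText.Extract(data);`. Creating one thread per call is simple but not cheap, so for many documents it may be worth reusing a single STA thread instead.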