c# ms-office office-interop publisher

Get different style sections in Microsoft Publisher via Interop


I have a small C# app that extracts text from a Microsoft Publisher file via the COM Interop API. This works fine, but I'm struggling when a single text range contains multiple styles. Potentially every character in a word could have a different font, format, etc.
Do I really have to compare character by character? Or is there something that returns the differently styled runs, much like I can get the individual paragraphs?

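// pg (a Publisher page) and text (assumed to be a StringBuilder) are defined elsewhere.
foreach (Microsoft.Office.Interop.Publisher.Shape shp in pg.Shapes)
{
    if (shp.HasTextFrame == MsoTriState.msoTrue)
    {
        // Fetch the range once instead of crossing the COM boundary on every access.
        TextRange whole = shp.TextFrame.TextRange;
        text.Append(whole.Text);

        // Words() is 1-based, so iterate from 1 to WordsCount inclusive.
        for (int i = 1; i <= whole.WordsCount; i++)
        {
            TextRange range = whole.Words(i, 1);
            string test = range.Text;
        }
    }
}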

Or is there a better way in general to extract the text from a Publisher file? I need to be able to write it back with the same formatting, because the text is being extracted for translation.


Solution

  • We tried an approach where we compared as many font properties as possible for every character and started a new section whenever one of them changed. Not pretty, but it works in most cases...
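
The character-by-character comparison described above could be sketched roughly as follows. This is not the code we actually used, only a minimal illustration of the idea. StyleRuns, Split and the styleAt callback are hypothetical names; the grouping logic is kept free of Interop so it can be run on its own. With Publisher, styleAt would presumably read properties such as Font.Name, Font.Size, Font.Bold and Font.Italic from range.Characters(i + 1, 1), and it is assumed here that the indexes of TextRange.Text line up with Characters().

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class StyleRuns
{
    // Splits text into runs of consecutive characters that share the same style key.
    // styleAt is a hypothetical callback returning a comparable style for the
    // character at a 0-based index, e.g. a tuple of font properties.
    public static List<(string Text, TStyle Style)> Split<TStyle>(
        string text, Func<int, TStyle> styleAt)
    {
        var runs = new List<(string Text, TStyle Style)>();
        if (string.IsNullOrEmpty(text))
            return runs;

        var comparer = EqualityComparer<TStyle>.Default;
        TStyle current = styleAt(0);
        var buffer = new StringBuilder();

        for (int i = 0; i < text.Length; i++)
        {
            TStyle style = styleAt(i);
            if (!comparer.Equals(style, current))
            {
                // Style changed: close the current run and start a new one.
                runs.Add((buffer.ToString(), current));
                buffer.Clear();
                current = style;
            }
            buffer.Append(text[i]);
        }

        // Flush the final run.
        runs.Add((buffer.ToString(), current));
        return runs;
    }
}
```

Each run can then be translated separately and written back with the style it was read with. Reading one character at a time means one or more COM calls per character, so this is likely to be slow on large documents.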