OpenXML Powertools HtmlConverter Fails when Document created with OpenXML SDK


I have written a Word document using the OpenXML SDK 2.5. The document has the expected appearance and formatting when I open it in MS Office.

Now I need to convert this document to HTML. I learned about HtmlConverter in OpenXML Powertools and tried to use it, but the DOCX-to-HTML conversion failed with a NullReferenceException reporting that the part parameter is null.

To investigate, I created a new Word document (in MS Word) with exactly the same content as my document. This file converts to HTML successfully, so the problem lies with the document I created in C#. I noticed that the file sizes differ: the document created in MS Word is larger, and the one created with the OpenXML SDK is smaller. I renamed both files to .zip to check their contents. The document.xml markup of both is captured below: the markup of the document created in MS Office is at the top, and the markup of the document created with the OpenXML SDK is at the bottom.

[Screenshot: document.xml markup of both files, Word-created document on top, SDK-created document at the bottom]
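
Renaming the files to .zip shows the contents by hand; the same check can be sketched in code. The following is a minimal sketch, not part of the original post, that uses only System.IO.Compression to list the parts in each package and report those present in the Word-created file but absent from the SDK-created one. The class name `PackageDiff`, its method names, and the path `D:\FromWord.docx` are hypothetical, chosen for illustration. On .NET Framework the project needs a reference to System.IO.Compression.FileSystem.

```csharp
using System;
using System.Collections.Generic;
using System.IO.Compression;
using System.Linq;

// Hypothetical helper (not from the original post): compares the part lists
// of two .docx packages by reading them as plain zip archives.
public static class PackageDiff
{
    // Returns the names of all entries (parts) inside the package.
    public static SortedSet<string> ListParts(string packagePath)
    {
        using (ZipArchive zip = ZipFile.OpenRead(packagePath))
        {
            return new SortedSet<string>(
                zip.Entries.Select(e => e.FullName),
                StringComparer.OrdinalIgnoreCase);
        }
    }

    // Returns the parts present in the reference package but absent
    // from the candidate package.
    public static List<string> MissingParts(string referencePath, string candidatePath)
    {
        SortedSet<string> reference = ListParts(referencePath);
        SortedSet<string> candidate = ListParts(candidatePath);
        return reference.Where(p => !candidate.Contains(p)).ToList();
    }
}
```

Calling `PackageDiff.MissingParts(@"D:\FromWord.docx", @"D:\15052018.docx")` would then return the entries that exist only in the Word-created file. This compares part names only, not the XML inside each part.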

I suspect that HtmlConverter fails because of these markup differences. Is my assumption correct? If so, how can I add the additional markup to the document? Here is the code I used to create the Word file.

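// Namespaces used: DocumentFormat.OpenXml, DocumentFormat.OpenXml.Packaging, DocumentFormat.OpenXml.Wordprocessing
using (WordprocessingDocument wordDocument = WordprocessingDocument.Create(@"D:\15052018.docx", WordprocessingDocumentType.Document))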
{
    MainDocumentPart mainPart = wordDocument.AddMainDocumentPart();
    mainPart.Document = new Document();
    Body body = mainPart.Document.AppendChild(new Body());
    Paragraph para = body.AppendChild(new Paragraph());
    Run run = para.AppendChild(new Run());
    RunProperties rpr = new RunProperties(new RunFonts() { Ascii = "Times New Roman" });
    run.PrependChild<RunProperties>(rpr);
    run.AppendChild(new Text("Welcome"));
    wordDocument.Save();
    wordDocument.Close();
}

For the HTML conversion:

// Namespaces used: System.IO, DocumentFormat.OpenXml.Packaging, OpenXmlPowerTools
using (WordprocessingDocument doc = WordprocessingDocument.Open(@"D:\15052018.docx", true))
{
    HtmlConverterSettings settings = new HtmlConverterSettings() { PageTitle = "My Page Title" };
    var html = HtmlConverter.ConvertToHtml(wDoc: doc, htmlConverterSettings: settings);
    File.WriteAllText(@"D:\Test1.html", html.ToStringNewLineOnAttributes());
}

Solution

  • To see the file differences, I would suggest you compare the file you created with the SDK against the file you created with Word. You can do this with the Open XML Productivity Tool. To install the tool, follow these steps:

    1. Go to the download link
    2. Click the red Download button.
    3. On the next screen, check the box next to OpenXMLSDKToolV25.msi.
    4. Then click Next; the download will start in your browser automatically.

    Once installed, launch the tool.

    To compare two Open XML files, click the Compare Files button in the middle; the differences will be shown.

    Once your files are open in compare mode, select the main Document part in the part selector area on the right-hand side and click the "View Part Diff" button.

    This will show you the XML that is different. If you click View Package Code, the tool generates C# code that can make up the difference between the two files, if you need it.

    Pro tip - to generate just the code needed to build the file that was created by Word, open it in the Productivity Tool in non-compare mode by using the Open File button. Then click Reflect Code to generate the C# code needed to recreate an exact clone of your Word-generated file.
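
Which parts are actually missing depends on what the comparison shows for your files. Purely as an illustration, and assuming the diff reveals that the SDK-generated package has no styles part (an assumption, not something confirmed in this thread), code to add one might look like the sketch below. The class `StylePartHelper` and the method `EnsureStylesPart` are hypothetical names; the SDK calls used are `MainDocumentPart.StyleDefinitionsPart`, `AddNewPart<StyleDefinitionsPart>()`, and `Styles.Save()`.

```csharp
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper (an assumption for illustration): adds an empty
// styles part to a document that does not have one yet.
public static class StylePartHelper
{
    // Returns true if a styles part was added, false if one already existed.
    public static bool EnsureStylesPart(string docxPath)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open(docxPath, true))
        {
            MainDocumentPart mainPart = doc.MainDocumentPart;
            if (mainPart.StyleDefinitionsPart != null)
            {
                return false; // nothing to do
            }

            // Create the part and give it an empty <w:styles> root element.
            StyleDefinitionsPart stylesPart = mainPart.AddNewPart<StyleDefinitionsPart>();
            stylesPart.Styles = new Styles();
            stylesPart.Styles.Save();
            return true;
        }
    }
}
```

If the Productivity Tool reports other missing parts or elements instead, the code it generates with View Package Code is the more reliable guide.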