I'm using ASP.NET Core 7.0 and Open XML SDK 2.19.0.
I'm cloning an existing template Word document from disk to a new file and then inserting HTML at a specific place, indicated by placeholder text, using an AltChunk. No matter how simple the content in the AltChunk is, Word always reports the document as corrupted when I try to open it.
string rootPath = _environment.WebRootPath;
string filePath = Path.Combine(rootPath, "files", "quotes", $"{DetailedQuote.Quote.QuoteID}.docx");
// Open the original template document and clone it to the new path as editable
DetailedTemplate.templateDocument = (WordprocessingDocument) _document.ReadWordDoc(quote.Template.TemplateID, "template").Clone(filePath, true);
// Insert content from service documents
var mainPart = DetailedTemplate.templateDocument.MainDocumentPart;
var paragraphs = mainPart.Document.Body.Descendants<Paragraph>();
foreach (var paragraph in paragraphs)
{
    if (paragraph.InnerText == "[##(SERVICE_DETAILS)##]")
    {
        string serviceDescriptionHTML = "Hello";
        var chunkID = 0;
        foreach (var service in DetailedQuote.Quote.QuoteServices)
        {
            string sChunkID = $"myhtmlID{chunkID++}";
            AlternativeFormatImportPart oChunk = mainPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.Html, sChunkID);
            using (MemoryStream memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(serviceDescriptionHTML)))
            {
                oChunk.FeedData(memoryStream);
            }
            AltChunk oAltChunk = new AltChunk();
            oAltChunk.Id = sChunkID;
            // Add the chunk to the paragraph
            paragraph.Parent.InsertAfter(oAltChunk, paragraph);
        }
    }
}
// Save changes to the main document
mainPart.Document.Save();
// Close the document so that we can read it from disk
DetailedTemplate.templateDocument.Close();
// Return the content of the main document as a FileResult
byte[] fileBytes = System.IO.File.ReadAllBytes(filePath);
return File(fileBytes, "application/vnd.openxmlformats-officedocument.wordprocessingml.document", "MyDocument.docx");
The OpenXmlValidator I run after the document is created (not included in my example) doesn't report any errors and neither does the Open XML SDK 2.5 Productivity Tool.
If I instead simply update the text of the paragraph, then the document opens without errors in Word.
...
foreach (var service in DetailedQuote.Quote.QuoteServices)
{
    var text = paragraph.Descendants<Text>().FirstOrDefault();
    if (text != null)
    {
        text.Text = "This is text!";
    }
}
...
To me, this can only mean that adding the AltChunk is breaking something, but as far as I understand, an AltChunk is the correct way to add HTML to a Word document.
I've spent two days reading pretty much everything I can find on the topic, and I've asked every bot out there to help me find the issue. I've tried Open-Xml-PowerTools but can't find any good documentation, and I've tried HtmlToOpenXml but ran into versioning issues. I've also opened the .docx file to dig through it manually, but so far I haven't been able to resolve this.
Any and all help is greatly appreciated!
[Edit]
If I let Word try to recover the generated document, the contents are present and look as expected. If I then save the "recovered" document as a new file, that document is also flagged as corrupted when I open it in Word again.
The problem occurs because Microsoft Word is unable to parse the bare string "Hello" as HTML. Yeah... I know...
Anyway, try using this:
string serviceDescriptionHTML = "<html>Hello</html>";
Or this:
string serviceDescriptionHTML = "<body>Hello</body>";
Or this:
string serviceDescriptionHTML = "<!DOCTYPE html>Hello";
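If the service descriptions come from somewhere you don't control, it may be easier to wrap every fragment the same way before feeding it to the chunk. Below is a minimal sketch of that idea. Note that `HtmlChunk` and `WrapFragment` are names I made up for illustration (they are not part of the Open XML SDK), and the `<meta charset>` declaration is my own addition, on the assumption that you keep encoding the bytes as UTF-8 like in the question.

```csharp
using System;

public static class HtmlChunk
{
    // Hypothetical helper, not an Open XML SDK API.
    // Wraps an HTML fragment in a minimal HTML document so the chunk
    // is never just bare text. Input that already contains an <html>
    // tag is returned unchanged.
    public static string WrapFragment(string fragment)
    {
        if (fragment is null)
        {
            throw new ArgumentNullException(nameof(fragment));
        }

        // Simple check: does the input already look like a full document?
        if (fragment.IndexOf("<html", StringComparison.OrdinalIgnoreCase) >= 0)
        {
            return fragment;
        }

        // Assumption: the bytes fed to the AltChunk part are UTF-8,
        // so the charset is declared to match.
        return "<!DOCTYPE html><html><head><meta charset=\"utf-8\"></head><body>"
               + fragment
               + "</body></html>";
    }
}
```

In the code from the question, this would be used as `string serviceDescriptionHTML = HtmlChunk.WrapFragment("Hello");`, with the rest of the AltChunk code left as it is.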