c#, zip, openxml, docx, openxml-sdk

Editing a custom XML part in a Word document sometimes corrupts the document


We have a system that stores some custom templating data in a Word document. Sometimes, updating this data causes Word to complain that the document is corrupted. When that happens, if I unzip the .docx file and compare its contents to the previous version, the only difference appears to be the expected change in the customXML\item.xml file. If I re-zip the contents using 7-Zip, it seems to work OK (Word no longer complains that the document is corrupt).

The (simplified) code:

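using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Runtime.Serialization;
using DocumentFormat.OpenXml.Packaging;

void CreateOrReplaceCustomXml(string filename, MyCustomData data)
{
    // Open the document for editing (second argument: isEditable).
    using (var doc = WordprocessingDocument.Open(filename, true))
    {
        // Reuse our existing custom XML part, or add one if there is none yet.
        var part = GetCustomXmlParts(doc).SingleOrDefault();
        if (part == null)
        {
            part = doc.MainDocumentPart.AddCustomXmlPart(CustomXmlPartType.CustomXml);
        }

        // Serialize to memory first, then rewind and copy the bytes into the part.
        var serializer = new DataContractSerializer(typeof(MyCustomData));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, data);
            stream.Seek(0, SeekOrigin.Begin);
            part.FeedData(stream);
        }
    }
}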

IEnumerable<CustomXmlPart> GetCustomXmlParts(WordprocessingDocument doc)
{
    // Keep only the custom XML parts whose content mentions our namespace.
    return doc.MainDocumentPart.CustomXmlParts
        .Where(part =>
        {
            using (var stream = doc.Package.GetPart(part.Uri).GetStream())
            using (var streamReader = new StreamReader(stream))
            {
                return streamReader.ReadToEnd().Contains("Some.Namespace");
            }
        });
}

Any suggestions?


Solution

  • Since re-zipping fixes the file, the content itself appears to be well-formed.

    That points to the way the package is being zipped. Open the corrupted docx in 7-Zip and note the values in the "Method" column, especially for customXML\item.xml.

    Then compare them with the values in a working docx: are they the same or different? Entries stored with the "Deflate" method work.
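If you would rather run the same comparison from code than from the 7-Zip UI, the following is a minimal sketch that reads the compression method of every entry straight from the ZIP central directory. The names `ZipMethodInspector`, `ReadCompressionMethods` and `Describe` are hypothetical helpers made up for this illustration, not part of the Open XML SDK. It assumes a small, single-disk archive without ZIP64 extensions, a little-endian machine (because of `BitConverter`), and UTF-8/ASCII entry names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper for illustration only.
static class ZipMethodInspector
{
    const uint EndOfCentralDirectorySignature = 0x06054b50;
    const uint CentralFileHeaderSignature = 0x02014b50;

    // Returns entry name -> compression method code (0 = Stored, 8 = Deflate).
    public static Dictionary<string, int> ReadCompressionMethods(byte[] zip)
    {
        // The end-of-central-directory record is at least 22 bytes long and sits
        // at the end of the file, so scan backwards for its signature.
        int eocd = -1;
        for (int i = zip.Length - 22; i >= 0; i--)
        {
            if (BitConverter.ToUInt32(zip, i) == EndOfCentralDirectorySignature)
            {
                eocd = i;
                break;
            }
        }
        if (eocd < 0)
        {
            throw new InvalidDataException("End of central directory record not found.");
        }

        int entryCount = BitConverter.ToUInt16(zip, eocd + 10);
        int pos = (int)BitConverter.ToUInt32(zip, eocd + 16);

        var methods = new Dictionary<string, int>();
        for (int n = 0; n < entryCount; n++)
        {
            if (BitConverter.ToUInt32(zip, pos) != CentralFileHeaderSignature)
            {
                throw new InvalidDataException("Unexpected central directory layout.");
            }

            int method = BitConverter.ToUInt16(zip, pos + 10);
            int nameLength = BitConverter.ToUInt16(zip, pos + 28);
            int extraLength = BitConverter.ToUInt16(zip, pos + 30);
            int commentLength = BitConverter.ToUInt16(zip, pos + 32);

            // The file name follows the 46-byte fixed part of the header.
            string name = Encoding.UTF8.GetString(zip, pos + 46, nameLength);
            methods[name] = method;

            pos += 46 + nameLength + extraLength + commentLength;
        }
        return methods;
    }

    // Maps the two method codes relevant here to readable names.
    public static string Describe(int method)
    {
        switch (method)
        {
            case 0: return "Stored";
            case 8: return "Deflate";
            default: return "Other (" + method + ")";
        }
    }
}
```

Calling `ZipMethodInspector.ReadCompressionMethods(File.ReadAllBytes(path))` on both the corrupted and the working docx, and printing `Describe(...)` for each entry, should give the same information as the "Method" column in 7-Zip, so any entry whose method differs between the two files stands out.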