c# .net-6.0 openxml cpu-word memorystream

DocumentFormat.OpenXml: Is it possible to save a Word document to a MemoryStream?


I am working on a function that creates a simple Word document from a string, using DocumentFormat.OpenXml Version="2.20.0". I don't understand why I can't save the document to a memory stream, whereas saving it to a file works.

    public Task<byte[]> ConvertToWordAsync(string text)
    {
        if (text.IsNullOrEmpty())
            return Task.FromResult(Array.Empty<byte>());

        using var memoryStream = new MemoryStream();
        using var wordDocument = WordprocessingDocument.Create(memoryStream, WordprocessingDocumentType.Document);
        
        MainDocumentPart mainPart = wordDocument.AddMainDocumentPart();
        mainPart.Document = new Document
        {
            Body = new Body()
        };
        Body body = mainPart.Document.Body;
        Paragraph paragraph = new Paragraph();
        Run run = new Run();
        Text bodyText = new Text(text);
        run.Append(bodyText);
        paragraph.Append(run);
        body.Append(paragraph);

        wordDocument.Save();
        
        return Task.FromResult(memoryStream.ToArray());
    }

When I call this function, the memory stream is always empty. If I change

using var wordDocument = WordprocessingDocument.Create(memoryStream, WordprocessingDocumentType.Document);

to

using var wordDocument = WordprocessingDocument.Create("C:\\Workspace\\65.docx", WordprocessingDocumentType.Document);

I am able to open the resulting Word file.

I don't understand why I can't save the same Word document to a memory stream. Do you have any idea how to solve this problem?


Solution

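  • tl;dr A using declaration (using var) is only disposed at the end of the enclosing scope. When you rely on the Dispose call, use a using statement with an explicit block so that you control where it happens.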

    A using declaration such as using var foo = new Foo(); foo.Whatever(); is still lowered to roughly this code when compiled (irrelevant details omitted, see: What are the uses of "using" in C#?):

        var foo = new Foo();
        try
        {
            foo.Whatever();
        }
        finally
        {
            foo.Dispose();
        }
    

    What matters here is where the Dispose call ends up.
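
    To make the ordering concrete, here is a minimal sketch that compares a using declaration with a using statement. Probe and UsingOrderDemo are hypothetical names invented for this illustration; they are not part of the question's code or of any library.

    ```csharp
    using System;
    using System.Collections.Generic;

    // Hypothetical helper: records the moment it is disposed.
    sealed class Probe : IDisposable
    {
        private readonly List<string> _log;
        public Probe(List<string> log) => _log = log;
        public void Dispose() => _log.Add("dispose");
    }

    static class UsingOrderDemo
    {
        // using declaration: Dispose runs when the method scope ends,
        // which is after the return expression has been evaluated.
        public static List<string> WithDeclaration()
        {
            var log = new List<string>();
            using var probe = new Probe(log);
            log.Add("work");
            return new List<string>(log); // snapshot taken before Dispose
        }

        // using statement: Dispose runs at the closing brace,
        // which is before the return expression is evaluated.
        public static List<string> WithStatement()
        {
            var log = new List<string>();
            using (var probe = new Probe(log))
            {
                log.Add("work");
            }
            return new List<string>(log); // snapshot taken after Dispose
        }
    }
    ```

    WithDeclaration() returns a list containing only "work", while WithStatement() returns "work" followed by "dispose". The snapshot plays the role of memoryStream.ToArray() in the question.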

    For this code (details omitted):

    public Task<byte[]> ConvertToWordAsync(string text)
    {
        using var memoryStream = new MemoryStream();
        using var wordDocument = WordprocessingDocument.Create(memoryStream, WordprocessingDocumentType.Document);
        
        // details omitted for brevity
    
        wordDocument.Save();
        
        return Task.FromResult(memoryStream.ToArray());
    }
    

    the compiler generates roughly this:

    public Task<byte[]> ConvertToWordAsync(string text)
    {
        var memoryStream = new MemoryStream();
        try
        {
            var wordDocument = WordprocessingDocument.Create(memoryStream, WordprocessingDocumentType.Document);
            try
            {
                // details omitted for brevity

                wordDocument.Save();

                // ToArray() is evaluated here, before wordDocument.Dispose() runs
                return Task.FromResult(memoryStream.ToArray());
            }
            finally
            {
                wordDocument.Dispose();
            }
        }
        finally
        {
            memoryStream.Dispose();
        }
    }
    

    The problem is that WordprocessingDocument has to write a complete ZIP archive containing multiple files to its stream in order to produce a valid Open XML container. It only finishes doing so once no more calls to Save() are expected, that is, when Close() is called or its Dispose() method is invoked.
    Before that, the stream is incomplete at best.

    Because of where the compiler emitted the finally blocks with the calls to the Dispose methods, the memoryStream wasn't even close to being complete when ToArray() was called. It was complete by the time your method returned, but by then no code was left to capture that data.
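
    The same effect can be reproduced without the Open XML SDK. The sketch below uses ZipArchive from System.IO.Compression as a stand-in for the ZIP container. That is an assumption made for illustration, not a claim about what WordprocessingDocument does internally. ZipFlushDemo and the entry name are made up for this example.

    ```csharp
    using System.IO;
    using System.IO.Compression;
    using System.Text;

    static class ZipFlushDemo
    {
        // Returns the stream length before and after the archive is disposed.
        public static (long Before, long After) Measure()
        {
            using var stream = new MemoryStream();
            long before;

            // leaveOpen: true keeps the MemoryStream open after the archive is disposed.
            using (var archive = new ZipArchive(stream, ZipArchiveMode.Update, leaveOpen: true))
            {
                ZipArchiveEntry entry = archive.CreateEntry("word/document.xml");
                using (var writer = new StreamWriter(entry.Open(), Encoding.UTF8))
                {
                    writer.Write("<document/>");
                }

                // The archive has not been finalized yet, so the stream is still incomplete.
                before = stream.Length;
            } // archive.Dispose() writes the remaining ZIP structures here

            return (before, stream.Length);
        }
    }
    ```

    Calling ZipFlushDemo.Measure() gives an After value larger than Before, because the archive is only completed on Dispose. Reading the stream inside the block is the equivalent of calling ToArray() too early in the question.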

    You say you solved the issue by explicitly calling Dispose. That works. Alternatively, fall back to a using statement with an explicit block, which has no surprises:

    using var memoryStream = new MemoryStream();
    using (var wordDocument = WordprocessingDocument.Create(memoryStream, WordprocessingDocumentType.Document))
    {
        // details omitted for brevity

        wordDocument.Save();
    } // wordDocument.Dispose() called here
    return Task.FromResult(memoryStream.ToArray());
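
    Putting the pieces together, a complete version of the method might look like the sketch below. It reuses the body from the question and assumes the DocumentFormat.OpenXml 2.20.0 package is referenced. The WordExport class name is hypothetical, and string.IsNullOrEmpty(text) is swapped in for the question's text.IsNullOrEmpty() call, which is presumably a custom extension method.

    ```csharp
    using System;
    using System.IO;
    using System.Threading.Tasks;
    using DocumentFormat.OpenXml;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Wordprocessing;

    public static class WordExport // hypothetical container class
    {
        public static Task<byte[]> ConvertToWordAsync(string text)
        {
            // Assumption: replaces the question's text.IsNullOrEmpty() extension call.
            if (string.IsNullOrEmpty(text))
                return Task.FromResult(Array.Empty<byte>());

            using var memoryStream = new MemoryStream();
            using (var wordDocument = WordprocessingDocument.Create(memoryStream, WordprocessingDocumentType.Document))
            {
                MainDocumentPart mainPart = wordDocument.AddMainDocumentPart();
                var body = new Body();
                mainPart.Document = new Document { Body = body };

                // Same structure as in the question: paragraph -> run -> text.
                var paragraph = new Paragraph();
                var run = new Run();
                run.Append(new Text(text));
                paragraph.Append(run);
                body.Append(paragraph);

                wordDocument.Save();
            } // wordDocument.Dispose() runs here and completes the package

            // The stream is read only after the document has been disposed.
            return Task.FromResult(memoryStream.ToArray());
        }
    }
    ```

    For a non-empty input, the returned array should start with the ZIP signature bytes 0x50 0x4B ("PK"), which is a quick sanity check that the container was actually written.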