How to modify content in a FileStream while using Open XML?


In the code below, I am merging some files together and saving them in the test.docx file. However, before I merge each file, I would like to first replace the text of some content controls that are used as placeholders. Can someone show me how to do that?

Suppose I have one content control in template2 and it is called placeholder1. How can I add text to this placeholder while using the FileStream?

    string fileName = Path.Combine(@"Docs\templates", "test.docx");
    for (int i = 1; i < 3; i++)
    {
        string filePath = Path.Combine(@"Docs\templates", "report-Part" + i + ".docx");

        //using (MemoryStream ms = new MemoryStream())
        //{
        //ms.Write(templateFile, 0, templateFile.Length);
        using (WordprocessingDocument myDoc = WordprocessingDocument.Open(fileName, true))
        {
            MainDocumentPart mainPart = myDoc.MainDocumentPart;
            string altChunkId = "AltChunkId" + Guid.NewGuid();
            AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.WordprocessingML, altChunkId);
            using (FileStream fileStream = File.Open(filePath, FileMode.Open))
            {
                chunk.FeedData(fileStream);
            }
            //chunk.FeedData(ms);
            AltChunk altChunk = new AltChunk();
            altChunk.Id = altChunkId;

            Paragraph paragraph2 = new Paragraph() { RsidParagraphAddition = "00BE27E7", RsidRunAdditionDefault = "00BE27E7" };

            Run run2 = new Run();
            Break break1 = new Break() { Type = BreakValues.Page };

            run2.Append(break1);
            paragraph2.Append(run2);
            mainPart.Document.Body.Append(paragraph2);
            var lastParagraph = mainPart.Document.Body.Elements<Paragraph>().Last();
            mainPart.Document.Body.InsertAfter(altChunk, lastParagraph);
            mainPart.Document.Save();
            // The using block disposes (and thereby closes) the document.
        }
        //ms.Position = 0;
        ////ms.ToArray();
        //output = new byte[ms.ToArray().Length];
        //ms.Read(output, 0, output.Length);
        //}
    }

Solution

  • The following sample code, written as an xUnit unit test, shows how you can achieve this. I've added code comments to explain what is done and why.

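        // Namespaces used by the test class below. ContentControlWriter is
        // defined in the CodeSnippets repository mentioned after the code, so
        // its namespace is not listed here.
        using System;
        using System.Collections.Generic;
        using System.IO;
        using DocumentFormat.OpenXml;
        using DocumentFormat.OpenXml.Packaging;
        using DocumentFormat.OpenXml.Wordprocessing;
        using Xunit;

        public class AltChunkAssemblyTests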
        {
            // Sample template file names for unit testing purposes.
            private readonly string[] _templateFileNames =
            {
                "report-Part1.docx",
                "report-Part2.docx",
                "report-Part3.docx"
            };
    
            // Sample content maps for unit testing purposes.
            // Each Dictionary<string, string> represents data used to replace the
            // content of block-level w:sdt elements identified by w:tag values of
            // "firstTag" and "secondTag".
            private readonly List<Dictionary<string, string>> _contentMaps = new List<Dictionary<string, string>>
            {
                new Dictionary<string, string>
                {
                    { "firstTag", "report-Part1: First value" },
                    { "secondTag", "report-Part1: Second value" }
                },
                new Dictionary<string, string>
                {
                    { "firstTag", "report-Part2: First value" },
                    { "secondTag", "report-Part2: Second value" }
                },
                new Dictionary<string, string>
                {
                    { "firstTag", "report-Part3: First value" },
                    { "secondTag", "report-Part3: Second value" }
                }
            };
    
            [Fact]
            public void CanAssembleDocumentUsingAltChunks()
            {
                // Create some sample "templates" (technically documents) for unit
                // testing purposes.
                CreateSampleTemplates();
    
                // Create an empty result document.
                using WordprocessingDocument wordDocument = WordprocessingDocument.Create(
                    "AltChunk.docx", WordprocessingDocumentType.Document);
    
                MainDocumentPart mainPart = wordDocument.AddMainDocumentPart();
                var body = new Body();
                mainPart.Document = new Document(body);
    
                // Add one w:altChunk element for each sample template, using the
                // sample content maps for mapping sample data to the content
                // controls contained in the templates.
                for (var index = 0; index < _templateFileNames.Length; index++)
                {
                    if (index > 0) body.AppendChild(new Paragraph(new Run(new Break { Type = BreakValues.Page })));
                    body.AppendChild(CreateAltChunk(_templateFileNames[index], _contentMaps[index], wordDocument));
                }
            }
    
            private void CreateSampleTemplates()
            {
                // Create a sample template for each sample template file name.
                foreach (string templateFileName in _templateFileNames)
                {
                    CreateSampleTemplate(templateFileName);
                }
            }
    
            private static void CreateSampleTemplate(string templateFileName)
            {
                // Create a new Word document with paragraphs marking the start and
                // end of the template (for testing purposes) and two block-level
                // structured document tags identified by w:tag elements with values
                // "firstTag" and "secondTag" and values that are going to be
                // replaced by the ContentControlWriter during document assembly.
                using WordprocessingDocument wordDocument = WordprocessingDocument.Create(
                    templateFileName, WordprocessingDocumentType.Document);
    
                MainDocumentPart mainPart = wordDocument.AddMainDocumentPart();
                mainPart.Document =
                    new Document(
                        new Body(
                            new Paragraph(
                                new Run(
                                    new Text($"Start of template '{templateFileName}'"))),
                            new SdtBlock(
                                new SdtProperties(
                                    new Tag { Val = "firstTag" }),
                                new SdtContentBlock(
                                    new Paragraph(
                                        new Run(
                                            new Text("First template value"))))),
                            new SdtBlock(
                                new SdtProperties(
                                    new Tag { Val = "secondTag" }),
                                new SdtContentBlock(
                                    new Paragraph(
                                        new Run(
                                            new Text("Second template value"))))),
                            new Paragraph(
                                new Run(
                                    new Text($"End of template '{templateFileName}'")))));
            }
    
            private static AltChunk CreateAltChunk(
                string templateFileName,
                Dictionary<string, string> contentMap,
                WordprocessingDocument wordDocument)
            {
                // Copy the template file contents to a MemoryStream to be able to
                // update the content controls without altering the template file.
                using FileStream fileStream = File.OpenRead(templateFileName);
                using var memoryStream = new MemoryStream();
                fileStream.CopyTo(memoryStream);
    
                // Open the copy of the template on the MemoryStream, update the
                // content controls, save the updated template back to the
                // MemoryStream, and reset the position within the MemoryStream.
                using (WordprocessingDocument chunkDocument = WordprocessingDocument.Open(memoryStream, true))
                {
                    var contentControlWriter = new ContentControlWriter(contentMap);
                    contentControlWriter.WriteContentControls(chunkDocument);
                }
    
                memoryStream.Seek(0, SeekOrigin.Begin);
    
                // Create an AlternativeFormatImportPart from the MemoryStream.
                string altChunkId = "AltChunkId" + Guid.NewGuid();
                AlternativeFormatImportPart chunk = wordDocument.MainDocumentPart.AddAlternativeFormatImportPart(
                    AlternativeFormatImportPartType.WordprocessingML, altChunkId);
    
                chunk.FeedData(memoryStream);
    
                // Return the w:altChunk element to be added to the w:body element.
                return new AltChunk { Id = altChunkId };
            }
        }
    
    

    I've tested the code using the ContentControlWriter class that I created to answer your other question, on how to create a new document from a Word template with multiple pages using DocumentFormat.OpenXml. It works nicely. The complete code can be found in my CodeSnippets GitHub repository. Look for AltChunkAssemblyTests and ContentControlWriter.
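
The ContentControlWriter class itself is not reproduced in this answer. To give a rough idea of what such a class could do, here is a minimal sketch. It is not the implementation from the CodeSnippets repository: the class name SimpleContentControlWriter is hypothetical, and the sketch assumes that only block-level w:sdt elements need updating and that each one's content may be replaced by a single plain paragraph.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical stand-in for ContentControlWriter (an assumption for
// illustration, not the class from the CodeSnippets repository).
public class SimpleContentControlWriter
{
    private readonly IDictionary<string, string> _contentMap;

    public SimpleContentControlWriter(IDictionary<string, string> contentMap)
    {
        _contentMap = contentMap;
    }

    public void WriteContentControls(WordprocessingDocument wordDocument)
    {
        Body body = wordDocument.MainDocumentPart?.Document?.Body;
        if (body == null) return;

        // Materialize the list first because the loop modifies the tree.
        foreach (SdtBlock sdtBlock in body.Descendants<SdtBlock>().ToList())
        {
            // Look up the w:tag value; skip controls without a mapped tag.
            string tag = sdtBlock
                .GetFirstChild<SdtProperties>()?
                .GetFirstChild<Tag>()?
                .Val?.Value;

            if (tag == null || !_contentMap.TryGetValue(tag, out string value)) continue;

            SdtContentBlock content = sdtBlock.GetFirstChild<SdtContentBlock>();
            if (content == null) continue;

            // Replace whatever the control contains with one paragraph
            // holding the mapped text.
            content.RemoveAllChildren();
            content.AppendChild(new Paragraph(new Run(new Text(value))));
        }
    }
}
```

Under these assumptions, it could be used in CreateAltChunk() in place of ContentControlWriter, since it exposes the same constructor and WriteContentControls() shape that the test code calls. Inline-level content controls (w:sdt inside a paragraph) and formatting preservation are deliberately left out.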

    The CreateSampleTemplates() method creates three sample documents. For example, the main document part of report-Part1.docx has the following contents:

    <?xml version="1.0" encoding="utf-8"?>
    <w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
      <w:body>
        <w:p>
          <w:r>
            <w:t>Start of template 'report-Part1.docx'</w:t>
          </w:r>
        </w:p>
        <w:sdt>
          <w:sdtPr>
            <w:tag w:val="firstTag" />
          </w:sdtPr>
          <w:sdtContent>
            <w:p>
              <w:r>
                <w:t>First template value</w:t>
              </w:r>
            </w:p>
          </w:sdtContent>
        </w:sdt>
        <w:sdt>
          <w:sdtPr>
            <w:tag w:val="secondTag" />
          </w:sdtPr>
          <w:sdtContent>
            <w:p>
              <w:r>
                <w:t>Second template value</w:t>
              </w:r>
            </w:p>
          </w:sdtContent>
        </w:sdt>
        <w:p>
          <w:r>
            <w:t>End of template 'report-Part1.docx'</w:t>
          </w:r>
        </w:p>
      </w:body>
    </w:document>
    

    After assembly and without having Word save the document again, the main document part of AltChunk.docx looks like this:

    <?xml version="1.0" encoding="utf-8"?>
    <w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
      <w:body>
        <w:altChunk r:id="AltChunkId81885280-e38d-4ffb-b8a3-38d96992c2eb" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" />
        <w:p>
          <w:r>
            <w:br w:type="page" />
          </w:r>
        </w:p>
        <w:altChunk r:id="AltChunkId6d862de7-c477-42bc-baa4-c42441e5b03b" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" />
        <w:p>
          <w:r>
            <w:br w:type="page" />
          </w:r>
        </w:p>
        <w:altChunk r:id="AltChunkIdbfd7ea64-4cd0-4acf-9d6f-f3d405c021ca" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" />
      </w:body>
    </w:document>
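
To check this result without opening the file in Word, the w:altChunk elements and the parts they point to can be listed with the Open XML SDK. The following is only a sketch; the AltChunkInspector class and PrintAltChunks method are hypothetical names, and the file path is assumed to be the AltChunk.docx created by the test above.

```csharp
using System;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper for inspecting an assembled document.
public static class AltChunkInspector
{
    public static void PrintAltChunks(string path)
    {
        // Open read-only so that the document is not modified.
        using WordprocessingDocument wordDocument = WordprocessingDocument.Open(path, false);
        MainDocumentPart mainPart = wordDocument.MainDocumentPart;

        foreach (AltChunk altChunk in mainPart.Document.Body.Elements<AltChunk>())
        {
            // Resolve the relationship ID to the imported part.
            OpenXmlPart part = mainPart.GetPartById(altChunk.Id);
            Console.WriteLine($"{altChunk.Id.Value} -> {part.Uri} ({part.ContentType})");
        }
    }
}
```

Calling AltChunkInspector.PrintAltChunks("AltChunk.docx") after the test has run should print one line per w:altChunk element, which for the sample above means three lines.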
    

    I'm not sure why exactly you are using those w:altChunk elements and related parts to combine several Word documents. This requires Microsoft Word to do the "heavy lifting", although in your case it might be very easy to produce the correct markup directly. For example, as soon as you save the document in Microsoft Word, the main document part looks as follows (with additional XML namespaces, which I removed for clarity):

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <w:document xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" 
                xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
                xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml"
                mc:Ignorable="w14">
      <w:body>
        <w:p w14:paraId="76D6BC46" w14:textId="77777777" w:rsidR="00EA51EB" w:rsidRDefault="00B25FEF">
          <w:r>
            <w:t>Start of template 'report-Part1.docx'</w:t>
          </w:r>
        </w:p>
        <w:sdt>
          <w:sdtPr>
            <w:tag w:val="firstTag"/>
            <w:id w:val="-1950995891"/>
          </w:sdtPr>
          <w:sdtEndPr/>
          <w:sdtContent>
            <w:p w14:paraId="2701CE6E" w14:textId="77777777" w:rsidR="00EA51EB" w:rsidRDefault="00B25FEF">
              <w:r>
                <w:t>report-Part1: First value</w:t>
              </w:r>
            </w:p>
          </w:sdtContent>
        </w:sdt>
        <w:sdt>
          <w:sdtPr>
            <w:tag w:val="secondTag"/>
            <w:id w:val="551584029"/>
          </w:sdtPr>
          <w:sdtEndPr/>
          <w:sdtContent>
            <w:p w14:paraId="0B591553" w14:textId="77777777" w:rsidR="00EA51EB" w:rsidRDefault="00B25FEF">
              <w:r>
                <w:t>report-Part1: Second value</w:t>
              </w:r>
            </w:p>
          </w:sdtContent>
        </w:sdt>
        <w:p w14:paraId="7393CFF0" w14:textId="77777777" w:rsidR="00E60EE9" w:rsidRDefault="00B25FEF">
          <w:r>
            <w:t>End of template 'report-Part1.docx'</w:t>
          </w:r>
        </w:p>
        <w:p w14:paraId="089D32A3" w14:textId="77777777" w:rsidR="00E60EE9" w:rsidRDefault="00B25FEF">
          <w:r>
            <w:br w:type="page"/>
          </w:r>
        </w:p>
        <w:p w14:paraId="11AC41DA" w14:textId="77777777" w:rsidR="00716CCA" w:rsidRDefault="00B25FEF">
          <w:r>
            <w:lastRenderedPageBreak/>
            <w:t>Start of template 'report-Part2.docx'</w:t>
          </w:r>
        </w:p>
        <w:sdt>
          <w:sdtPr>
            <w:tag w:val="firstTag"/>
            <w:id w:val="-1559003811"/>
          </w:sdtPr>
          <w:sdtEndPr/>
          <w:sdtContent>
            <w:p w14:paraId="1867093C" w14:textId="77777777" w:rsidR="00716CCA" w:rsidRDefault="00B25FEF">
              <w:r>
                <w:t>report-Part2: First value</w:t>
              </w:r>
            </w:p>
          </w:sdtContent>
        </w:sdt>
        <w:sdt>
          <w:sdtPr>
            <w:tag w:val="secondTag"/>
            <w:id w:val="-1480071868"/>
          </w:sdtPr>
          <w:sdtEndPr/>
          <w:sdtContent>
            <w:p w14:paraId="43DA0FC0" w14:textId="77777777" w:rsidR="00716CCA" w:rsidRDefault="00B25FEF">
              <w:r>
                <w:t>report-Part2: Second value</w:t>
              </w:r>
            </w:p>
          </w:sdtContent>
        </w:sdt>
        <w:p w14:paraId="1F9B0122" w14:textId="77777777" w:rsidR="00E60EE9" w:rsidRDefault="00B25FEF">
          <w:r>
            <w:t>End of template 'report-Part2.docx'</w:t>
          </w:r>
        </w:p>
        <w:p w14:paraId="18873AAA" w14:textId="77777777" w:rsidR="00E60EE9" w:rsidRDefault="00B25FEF">
          <w:r>
            <w:br w:type="page"/>
          </w:r>
        </w:p>
        <w:p w14:paraId="16E23FE9" w14:textId="77777777" w:rsidR="003C3D2D" w:rsidRDefault="00B25FEF">
          <w:r>
            <w:lastRenderedPageBreak/>
            <w:t>Start of template 'report-Part3.docx'</w:t>
          </w:r>
        </w:p>
        <w:sdt>
          <w:sdtPr>
            <w:tag w:val="firstTag"/>
            <w:id w:val="780077040"/>
          </w:sdtPr>
          <w:sdtEndPr/>
          <w:sdtContent>
            <w:p w14:paraId="00BA914F" w14:textId="77777777" w:rsidR="003C3D2D" w:rsidRDefault="00B25FEF">
              <w:r>
                <w:t>report-Part3: First value</w:t>
              </w:r>
            </w:p>
          </w:sdtContent>
        </w:sdt>
        <w:sdt>
          <w:sdtPr>
            <w:tag w:val="secondTag"/>
            <w:id w:val="-823814304"/>
          </w:sdtPr>
          <w:sdtEndPr/>
          <w:sdtContent>
            <w:p w14:paraId="10653801" w14:textId="77777777" w:rsidR="003C3D2D" w:rsidRDefault="00B25FEF">
              <w:r>
                <w:t>report-Part3: Second value</w:t>
              </w:r>
            </w:p>
          </w:sdtContent>
        </w:sdt>
        <w:p w14:paraId="1622299A" w14:textId="77777777" w:rsidR="00E60EE9" w:rsidRDefault="00B25FEF">
          <w:r>
            <w:t>End of template 'report-Part3.docx'</w:t>
          </w:r>
        </w:p>
        <w:sectPr w:rsidR="00E60EE9">
          <w:pgSz w:w="12240" w:h="15840"/>
          <w:pgMar w:top="1440" w:right="1440" w:bottom="1440" w:left="1440" w:header="708" w:footer="708" w:gutter="0"/>
          <w:cols w:space="708"/>
          <w:docGrid w:linePitch="360"/>
        </w:sectPr>
      </w:body>
    </w:document>
    

    Word adds the w:sectPr element, which you don't have to add (unless you want a specific page layout). It also adds the w:lastRenderedPageBreak elements, which are not required. Further, the attributes added to the w:p (Paragraph) elements and the elements (e.g., w:id, w:sdtEndPr) added to the w:sdt elements are optional.
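
As an illustration of producing that markup directly instead of relying on w:altChunk, here is a minimal sketch that copies the body content of an (already updated) template into the result document. BodyMerger and AppendBodyContent are hypothetical names. The sketch assumes templates as simple as the samples above, with plain paragraphs and content controls only; templates that reference styles, numbering, images, or other related parts would need those parts and their relationship IDs carried over as well, which this code does not attempt.

```csharp
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper that merges simple documents without w:altChunk.
public static class BodyMerger
{
    public static void AppendBodyContent(WordprocessingDocument source, Body targetBody)
    {
        Body sourceBody = source.MainDocumentPart?.Document?.Body;
        if (sourceBody == null) return;

        foreach (OpenXmlElement element in sourceBody.ChildElements)
        {
            // Skip the source's section properties so that the result
            // document keeps control over its own page layout.
            if (element is SectionProperties) continue;

            // Clone deeply; an element cannot belong to two trees at once.
            targetBody.AppendChild(element.CloneNode(true));
        }
    }
}
```

In the test above, this would replace the AddAlternativeFormatImportPart() and FeedData() calls: after the content controls have been written, the template copy on the MemoryStream is opened once more (read-only is enough) and passed to AppendBodyContent() together with the result document's body.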