.net, vb.net, openxml

Use template to generate Word document


I have a Word document stored on a server that I wish to use as a template. I need to replace certain text in it with data from a database.

The two issues I have are: Office is not installed on the server, so I cannot use Microsoft.Office.Interop, and I cannot save any documents to the server.

I think I am on the right track, but cannot come up with a viable solution. I am thinking my best route is to read the template into memory and send the result to the user as a byte array so they can save the file.

I was doing something like this, but am currently stumped.

Dim path As String = HttpContext.Current.Request.PhysicalApplicationPath & "Letters\Test.docx"
Dim docBA As Byte() = File.ReadAllBytes(path)

Dim wordDoc As WordprocessingDocument = WordprocessingDocument.Open(path, True)
Using (wordDoc)
    Dim docText As String = Nothing
    Dim sr As StreamReader = New StreamReader(wordDoc.MainDocumentPart.GetStream)

    Using (sr)
        docText = sr.ReadToEnd
    End Using

    Dim regexText As Regex = New Regex("FIRST_NAME")
    docText = regexText.Replace(docText, "TESTING!!!")

    Dim sw As StreamWriter = New StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create))
    Dim modBA As Byte()
    Using (sw)
        sw.Write(docText)
        modBA = sw.Encoding.GetBytes(sw.BaseStream, 0, sw.BaseStream.Length)
        HttpContext.Current.Response.AddHeader("content-disposition", "attachment;filename= DownloadSample.docx")
        HttpContext.Current.Response.ContentType = "application/octectstream"
        HttpContext.Current.Response.BinaryWrite(modBA)
        HttpContext.Current.Response.End()
    End Using
End Using

Solution

  • You're on the right track, working with the Open XML file format, rather than attempting to edit a document in the Word application in a server environment.

    One problem you're going to have, however, is that you won't reliably be able to read the content as your sample code does and use RegEx. The reason is that, in the underlying Word Open XML, text runs can be (and usually are) broken up by direct formatting commands, spelling errors, language formatting, and a myriad of other things. A placeholder such as FIRST_NAME may therefore be stored as several separate pieces of text, none of which matches the pattern on its own.

    Since the purpose of your choice of RegEx is to write data to "placeholders", the better approach is to use Content Controls (sdt elements) as the "targets". These can be located directly and data written to them. Content controls can even be bound to nodes in a Custom XML Part embedded in the document, so that you can edit that XML file rather than the Word document itself. There are examples for this on MSDN, as well as discussions in the MSDN and other forums.
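As a rough illustration of the content control approach, here is a minimal sketch using the Open XML SDK. It assumes the template contains plain text content controls whose Tag is set to the placeholder name (for example FIRST_NAME). The module name LetterBuilder and the function FillTemplate are hypothetical names chosen for this example, and nested content controls or controls bound to a Custom XML Part are not handled. Everything happens in a MemoryStream, so nothing is written to the server.

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq
Imports DocumentFormat.OpenXml.Packaging
Imports W = DocumentFormat.OpenXml.Wordprocessing

Public Module LetterBuilder

    ' Hypothetical helper (not from the original post): fills every content
    ' control whose Tag matches a key in "values" and returns the result as bytes.
    Public Function FillTemplate(templateBytes As Byte(),
                                 values As IDictionary(Of String, String)) As Byte()
        Using ms As New MemoryStream()
            ' Copy into an expandable stream; New MemoryStream(bytes) cannot grow.
            ms.Write(templateBytes, 0, templateBytes.Length)
            ms.Position = 0

            Using doc As WordprocessingDocument = WordprocessingDocument.Open(ms, True)
                Dim docBody As W.Body = doc.MainDocumentPart.Document.Body

                ' SdtElement is the base class of block, run and cell level controls.
                For Each sdt As W.SdtElement In docBody.Descendants(Of W.SdtElement)().ToList()
                    Dim props As W.SdtProperties = sdt.GetFirstChild(Of W.SdtProperties)()
                    If props Is Nothing Then Continue For

                    Dim tagElement As W.Tag = props.GetFirstChild(Of W.Tag)()
                    If tagElement Is Nothing OrElse tagElement.Val Is Nothing Then Continue For

                    Dim newText As String = Nothing
                    If Not values.TryGetValue(tagElement.Val.Value, newText) Then Continue For

                    Dim texts As List(Of W.Text) = sdt.Descendants(Of W.Text)().ToList()
                    If texts.Count = 0 Then Continue For

                    ' Keep the first text node (and its run formatting), drop the rest.
                    texts(0).Text = newText
                    For Each extra As W.Text In texts.Skip(1)
                        extra.Remove()
                    Next

                    ' The control no longer displays its placeholder prompt.
                    props.RemoveAllChildren(Of W.ShowingPlaceholder)()
                Next
            End Using ' Disposing the document saves the changes into ms.

            Return ms.ToArray()
        End Using
    End Function

End Module
```

In the page code you would then read the template with File.ReadAllBytes (as you already do), call FillTemplate with a dictionary built from your database row, and pass the returned array to Response.BinaryWrite. The content type for a .docx file is application/vnd.openxmlformats-officedocument.wordprocessingml.document.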