
How to replace Fields in Word document with their content using VBA?


Some sites use a textarea to publish code in articles. If someone copies and pastes such an article into Word, it shows an empty textarea with scrollbars and, below it, the code in a table with numbered lines.
I want to replace it with just the code (or with just the table, which I can successfully convert to text) by removing the textarea.

I tried to do it like this:

Sub RemoveTextBoxes()     
    Dim oFld    As Word.FormField
     
    With Application.ActiveDocument
        ' \\ TextInput Type requires to unprotect the document
        If .ProtectionType <> wdNoProtection Then .Unprotect
         
        ' \\ Loop all formfields in active document
        For Each oFld In .FormFields()
             
            ' \\ Only remove Formfield textboxes that have textinput only
            If oFld.Type = wdFieldFormTextInput And oFld.TextInput.Type = wdRegularText Then
             
                ' \\ Delete
                oFld.Delete
            End If
        Next
         
        ' \\ Reprotect the document
        .Protect wdAllowOnlyFormFields, True
    End With  
End Sub

If I press Alt+F9 (which displays field codes), I now see

{ HTMLCONTROL Forms.HTML :TextArea.1 } 

above the text box with scrollbars! If I close and reopen the document, it's still there.

How do I get this TextArea content and remove the element, or replace it with the content?


Solution

  • Dynamic content in Word is managed using "fields". Not all fields that accept input are "form fields", as you discovered when using Alt+F9 to display the field codes.

    Word's Find / Replace functionality is quite powerful: it can also be used to find fields, even specific fields. In this case, since you simply want them removed, the HTMLControl fields can be found and replaced with "nothing". (If you want to be more specific and leave some HTMLControl fields in place, add as much of the field code text to the search term as necessary to match only the fields you want removed.)

    Many people don't realize it, but you can search field codes without needing to display them; Find also works while field results are displayed. The trick is to set the Range.TextRetrievalMode to include field codes (in this case, I think including hidden text is also a good idea, but if that causes a problem, comment out or delete that line).

    The ^d in the search text represents the opening field bracket: {. If it were left out, only what is inside the brackets would be replaced (deleted), which I don't recommend. With ^d, the entire field, including the closing bracket, is affected.

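    Sub FindAndDeleteHtmlFields()
        Dim doc As Word.Document
        Dim rngFind As Word.Range

        Set doc = ActiveDocument
        Set rngFind = doc.Content
        ' Let Find see field codes (and hidden text) even when they are not displayed
        rngFind.TextRetrievalMode.IncludeFieldCodes = True
        rngFind.TextRetrievalMode.IncludeHiddenText = True
        With rngFind.Find
            .ClearFormatting
            .Replacement.ClearFormatting
            ' ^d matches the opening field brace, so the whole field is replaced
            .Text = "^d HTMLControl"
            .Replacement.Text = ""
            ' The field code reads HTMLCONTROL, so make sure case is ignored
            .MatchCase = False
            ' ^d is not available in wildcard searches
            .MatchWildcards = False
            .Execute Replace:=wdReplaceAll
        End With
    End Sub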
    

    Note that this also ports to C# - I have the impression that's actually where you're working...
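
    If you would rather go through the object model than through Find, the same fields can also be reached via the document's Fields collection. What follows is only a minimal sketch and has not been run against your document. The Sub name is made up for illustration, and it assumes the field code contains the text "HTMLCONTROL" as in your Alt+F9 output. As far as I know, ActiveDocument.Fields covers only the main text story, so fields in headers, footers or text boxes would not be touched. The loop runs backwards so that deleting a field does not shift the index of the fields still to be checked.

    ```vba
    Sub DeleteHtmlControlFieldsByLoop()
        ' Hypothetical helper: name and matching rule are assumptions, not part of the answer above
        Dim doc As Word.Document
        Dim i As Long

        Set doc = ActiveDocument

        ' Walk the collection from the end so deletions do not renumber unvisited fields
        For i = doc.Fields.Count To 1 Step -1
            ' Field.Code is a Range; its Text is what Alt+F9 shows between the braces
            If InStr(1, doc.Fields(i).Code.Text, "HTMLCONTROL", vbTextCompare) > 0 Then
                ' Log the code that is about to be removed, then delete the whole field
                Debug.Print "Deleting field: " & Trim$(doc.Fields(i).Code.Text)
                doc.Fields(i).Delete
            End If
        Next i
    End Sub
    ```

    Debug.Print writes each matched field code to the Immediate window (Ctrl+G in the VBA editor); commenting out the Delete line turns this into a dry run that only lists what would be removed.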