vb.net, openxml-sdk

Remove empty paragraphs from .docx using OpenXML SDK 2.0


I'm trying to remove empty paragraphs from a .docx file before parsing its content into XML. How can I achieve this? My current attempt:

Protected Sub removeEmptyParagraphs(ByRef body As DocumentFormat.OpenXml.Wordprocessing.Body)
    Dim colP As IEnumerable(Of Paragraph) = body.Descendants(Of Paragraph)()

    Dim count As Integer = colP.Count
    For Each p As Paragraph In colP
        If (p.InnerText.Trim() = String.Empty) Then
            body.RemoveChild(Of Paragraph)(p)
        End If
    Next
End Sub

Solution

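  • The problem you are most likely running into is removing elements from a collection while a For Each loop is still enumerating it. Descendants is evaluated lazily, so each removal changes the tree the loop is walking, and the loop can stop early or skip elements. In addition, body.RemoveChild expects a direct child of body, while Descendants also returns paragraphs nested deeper, for example inside tables. Use LINQ's ToList (Imports System.Linq) to take a snapshot of the paragraphs first, then call Remove on each empty one:

    Protected Sub removeEmptyParagraphs(ByRef body As DocumentFormat.OpenXml.Wordprocessing.Body)
        ' ToList() snapshots the paragraphs so removals do not disturb the loop.
        For Each p As Paragraph In body.Descendants(Of Paragraph)().ToList()
            ' Remove() detaches p from its own parent, wherever it is nested.
            If p.InnerText.Trim() = String.Empty Then p.Remove()
        Next
    End Sub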
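
One caveat worth sketching: a paragraph whose InnerText is empty is not always safe to delete. It may hold only an inline picture, carry a section break in its properties, or be the last paragraph left in a table cell (Word expects every cell to keep at least one). The variant below is a minimal sketch of a more conservative filter, not a definitive implementation. The names ParagraphCleanup, IsRemovable and RemoveEmptyParagraphsConservatively are hypothetical, and the list of things worth keeping is an assumption you may need to extend for your documents.

```vbnet
Imports System.Linq
Imports DocumentFormat.OpenXml.Wordprocessing

Module ParagraphCleanup
    ' Hypothetical helper: True only when the paragraph has no text
    ' and holds nothing else that should survive.
    Private Function IsRemovable(p As Paragraph) As Boolean
        If p.InnerText.Trim() <> String.Empty Then Return False

        ' An inline picture may contribute nothing to InnerText.
        If p.Descendants(Of Drawing)().Any() Then Return False

        ' A section break is stored inside the paragraph's properties.
        If p.Descendants(Of SectionProperties)().Any() Then Return False

        ' Keep the last remaining paragraph of a table cell.
        Dim cell As TableCell = TryCast(p.Parent, TableCell)
        If cell IsNot Nothing AndAlso cell.Elements(Of Paragraph)().Count() = 1 Then Return False

        Return True
    End Function

    Public Sub RemoveEmptyParagraphsConservatively(docBody As Body)
        ' Snapshot first, then remove, so the enumeration is not disturbed.
        For Each p As Paragraph In docBody.Descendants(Of Paragraph)().ToList()
            If IsRemovable(p) Then p.Remove()
        Next
    End Sub
End Module
```

Because the table-cell check runs against the live tree at the moment each paragraph is visited, a cell with several empty paragraphs is trimmed down to one rather than emptied completely.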