Tags: c#, ms-word, openxml, openxml-sdk, wordprocessingml

How can I remove empty paragraph elements?


I am trying to remove paragraphs that contain "{SomeText}". The method below does just that, but I noticed that after I remove the paragraphs, empty paragraph elements are left over.

How can I remove <w:p /> elements programmatically?

Below is what I initially used to remove paragraphs.

        using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(file, true))
        {
            MainDocumentPart mainPart = wordDoc.MainDocumentPart;
            Document D = mainPart.Document;

            foreach (Paragraph P in D.Descendants<Paragraph>())
            {
                if (P.InnerText.Contains("{SomeText}"))
                {
                    P.RemoveAllChildren();
                    //P.Remove();   //doesn't remove
                }
            }
            D.Save();
        }

This is what document.xml looks like afterwards:

<w:p />
<w:p />
<w:p />
<w:p />
<w:p />
<w:p />
<w:p />

Solution

  • The problem here:

            foreach (Paragraph P in D.Descendants<Paragraph>())
            {
                if (P.InnerText.Contains("{SomeText}"))
                {
                    P.Remove();   //doesn't remove
                }
            }
    

    is that you are removing an item from the collection while you are still iterating over it. For some reason, the Open XML SDK does not throw an exception here; it just silently quits the foreach loop. Attaching a debugger and stepping through will show you that. The fix is simple:

            foreach (Paragraph P in D.Descendants<Paragraph>().ToList())
            {
                if (P.InnerText.Contains("{SomeText}"))
                {
                    P.Remove();   //will now remove
                }
            }
    

    By adding ToList() (which requires using System.Linq;) you copy the paragraph references (a shallow copy) into a separate list and iterate over that list instead. Now when you remove a paragraph, it is removed from the document tree, and therefore from D.Descendants<Paragraph>(), but not from your list, so the iteration continues.
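
For reference, the snapshot-then-remove pattern can be wrapped in a small helper, together with a second helper for the `<w:p />` elements that RemoveAllChildren() already left behind. This is only a sketch: the class name `ParagraphCleanup` and both method names are made up for illustration, and skipping empty paragraphs that sit directly in a table cell is a cautious assumption (a cell is expected to keep at least one paragraph), not something the question asked for.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper class; the names are illustrative, not part of the Open XML SDK.
public static class ParagraphCleanup
{
    // Removes every paragraph whose text contains the marker.
    // Returns the number of paragraphs removed.
    public static int RemoveParagraphsContaining(Body body, string marker)
    {
        // Take a snapshot with ToList() first, so that removing elements
        // does not disturb the enumeration of Descendants<Paragraph>().
        var matches = body.Descendants<Paragraph>()
                          .Where(p => p.InnerText.Contains(marker))
                          .ToList();

        foreach (var paragraph in matches)
        {
            paragraph.Remove();
        }

        return matches.Count;
    }

    // Removes paragraphs with no child elements at all, i.e. <w:p />.
    // Paragraphs that still carry properties (w:pPr) are left alone.
    // Returns the number of paragraphs removed.
    public static int RemoveEmptyParagraphs(Body body)
    {
        // Assumption: empty paragraphs directly inside a table cell are kept,
        // because a cell should not be left without any paragraph.
        var empties = body.Descendants<Paragraph>()
                          .Where(p => !p.HasChildren && !(p.Parent is TableCell))
                          .ToList();

        foreach (var paragraph in empties)
        {
            paragraph.Remove();
        }

        return empties.Count;
    }
}
```

Inside the using block from the question, these would be called as ParagraphCleanup.RemoveParagraphsContaining(D.Body, "{SomeText}") and ParagraphCleanup.RemoveEmptyParagraphs(D.Body) before D.Save().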