textbox apache-poi xwpf

Could someone share how to delete a paragraph from a text box?


I am currently working on a project that manipulates .docx files with Apache POI. I have used the API to remove text from a run inside a text box, but I cannot figure out how to remove a whole paragraph inside a text box. I assume that I need to use the class CTP to obtain the paragraph object to remove. Any examples or suggestions would be greatly appreciated.


Solution

  • In "Replace text in text box of docx by using Apache POI" I have shown how to replace text in Word text box contents. The approach is to get a list of XML text run elements matching the XPath .//*/w:txbxContent/w:p/w:r, using an XmlCursor that selects that path in /word/document.xml.

    The same can of course be done with the path .//*/w:txbxContent/w:p, which selects the paragraphs in text box contents. Having those low-level paragraph XML objects, we can convert them into XWPFParagraphs to get their plain text. Then, if the plain text matches some criterion, we can simply remove the paragraph's XML.

    Code:

    import java.io.FileOutputStream;
    import java.io.FileInputStream;
    
    import org.apache.poi.xwpf.usermodel.*;
    
    import org.apache.xmlbeans.XmlObject;
    import org.apache.xmlbeans.XmlCursor;
    
    import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTP;
    
    import java.util.List;
    import java.util.ArrayList;
    
    public class WordRemoveParagraphInTextBox {
    
     public static void main(String[] args) throws Exception {
    
      // try-with-resources closes the input stream and the document even if an exception is thrown
      try (FileInputStream in = new FileInputStream("WordRemoveParagraphInTextBox.docx");
           XWPFDocument document = new XWPFDocument(in)) {
    
       // getParagraphs() returns the top-level body paragraphs; text boxes anchored in them are found via XPath
       for (XWPFParagraph paragraph : document.getParagraphs()) {
        XmlCursor cursor = paragraph.getCTP().newCursor();
        cursor.selectPath("declare namespace w='http://schemas.openxmlformats.org/wordprocessingml/2006/main' .//*/w:txbxContent/w:p");
    
        // first collect all text box paragraphs, then modify the XML in a second pass
        List<XmlObject> ctpsintxtbx = new ArrayList<XmlObject>();
    
        while (cursor.hasNextSelection()) {
         cursor.toNextSelection();
         XmlObject obj = cursor.getObject();
         ctpsintxtbx.add(obj);
        }
        for (XmlObject obj : ctpsintxtbx) {
         // parse a copy of the paragraph XML; the copy is only used to read the plain text
         CTP ctp = CTP.Factory.parse(obj.xmlText());
         //CTP ctp = CTP.Factory.parse(obj.newInputStream());
         XWPFParagraph bufferparagraph = new XWPFParagraph(ctp, document);
         String text = bufferparagraph.getText();
         if (text != null && text.contains("remove")) {
          // remove the original paragraph element from the document XML
          obj.newCursor().removeXml();
         }
        }
       }
    
       try (FileOutputStream out = new FileOutputStream("WordRemoveParagraphInTextBoxNew.docx")) {
        document.write(out);
       }
      }
     }
    }
    

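For readers without a POI setup at hand, the select-then-remove idea can also be tried on a bare XML fragment. The following is only a sketch under assumptions, not the POI approach itself: it uses the JDK's built-in DOM and XPath classes instead of XmlCursor and XWPFParagraph. The class name TextBoxParagraphSketch, the method removeTextBoxParagraphs and the SAMPLE markup are made up for illustration. SAMPLE is a heavily simplified stand-in for /word/document.xml (in a real file the w:txbxContent element is nested deeper, inside drawing or VML elements), and getTextContent() is used as a rough substitute for XWPFParagraph.getText().

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TextBoxParagraphSketch {

    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical, simplified markup: one body paragraph and one text box with two paragraphs.
    static final String SAMPLE =
          "<w:document xmlns:w='" + W_NS + "'><w:body>"
        + "<w:p><w:r><w:t>body text: do not remove</w:t></w:r></w:p>"
        + "<w:p><w:r><w:txbxContent>"
        +   "<w:p><w:r><w:t>please remove this</w:t></w:r></w:p>"
        +   "<w:p><w:r><w:t>keep this</w:t></w:r></w:p>"
        + "</w:txbxContent></w:r></w:p>"
        + "</w:body></w:document>";

    // Parses the XML namespace-aware, which the prefixed XPath below relies on.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Removes every w:p directly under w:txbxContent whose text contains the marker.
    // Returns the number of removed paragraphs.
    static int removeTextBoxParagraphs(Document doc, String marker) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Bind the prefix "w" to the WordprocessingML namespace for the XPath expression.
        xpath.setNamespaceContext(new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                return "w".equals(prefix) ? W_NS : XMLConstants.NULL_NS_URI;
            }
            public String getPrefix(String uri) { return null; }
            public Iterator<String> getPrefixes(String uri) { return null; }
        });

        NodeList hits = (NodeList) xpath.evaluate("//w:txbxContent/w:p", doc, XPathConstants.NODESET);

        // First pass: collect the matching paragraphs.
        List<Node> toRemove = new ArrayList<>();
        for (int i = 0; i < hits.getLength(); i++) {
            Node paragraph = hits.item(i);
            if (paragraph.getTextContent().contains(marker)) {
                toRemove.add(paragraph);
            }
        }
        // Second pass: detach them from their parent element.
        for (Node paragraph : toRemove) {
            paragraph.getParentNode().removeChild(paragraph);
        }
        return toRemove.size();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        int removed = removeTextBoxParagraphs(doc, "remove");
        System.out.println("removed paragraphs: " + removed);
        System.out.println("remaining text: " + doc.getDocumentElement().getTextContent());
    }
}
```

As in the POI code, the matching paragraphs are collected first and removed in a second pass. The body paragraph that also contains the word "remove" is left alone, because the XPath only matches paragraphs directly under w:txbxContent.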