I am currently working on a project that manipulates Docx files with the Apache POI project. I have used the API to remove text from a run inside a text box, but cannot figure out how to remove a paragraph inside a text box. I assume that I need to use the class CTP to obtain the paragraph object to remove. Any examples or suggestions would be greatly appreciated.
In "Replace text in text box of docx by using Apache POI" I have shown how to replace text in Word text-box-contents. The approach is to get a list of XML text run elements from the XPath .//*/w:txbxContent/w:p/w:r using an XmlCursor which selects that path from /word/document.xml.
The same can of course be done using the path .//*/w:txbxContent/w:p, which gets the text paragraphs in text-box-contents. Having those low-level paragraph XML objects, we can convert them into XWPFParagraphs to get the plain text out of them. Then, if the plain text matches some criterion, we can simply remove the paragraph's XML.
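To see what that path selects without needing a Word file, here is a minimal sketch that uses only the JDK's own XPath support instead of XmlCursor. The class name, the helper methods and the hand-written XML fragment are assumptions made for illustration; the fragment is heavily simplified compared to a real /word/document.xml. It selects .//*/w:txbxContent/w:p relative to one outer paragraph, collects the hits first and then removes those whose text contains the criterion.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Apache POI.
public class TextBoxParagraphXPathSketch {

    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Parses an XML string into a namespace-aware DOM document.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Removes every text-box paragraph whose text contains the criterion.
    // Returns the number of removed paragraphs.
    static int removeTextBoxParagraphs(Document doc, String criterion) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Bind the prefix "w" to the WordprocessingML namespace.
        xpath.setNamespaceContext(new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                return "w".equals(prefix) ? W_NS : XMLConstants.NULL_NS_URI;
            }
            public String getPrefix(String uri) { return null; }
            public Iterator<String> getPrefixes(String uri) { return null; }
        });

        // Same path as in the answer, evaluated relative to the root element.
        NodeList hits = (NodeList) xpath.evaluate(
                ".//*/w:txbxContent/w:p", doc.getDocumentElement(), XPathConstants.NODESET);

        // Collect first, remove afterwards.
        List<Node> toRemove = new ArrayList<>();
        for (int i = 0; i < hits.getLength(); i++) {
            Node p = hits.item(i);
            if (p.getTextContent().contains(criterion)) {
                toRemove.add(p);
            }
        }
        for (Node p : toRemove) {
            p.getParentNode().removeChild(p);
        }
        return toRemove.size();
    }

    public static void main(String[] args) throws Exception {
        // Simplified fragment: one outer paragraph holding a text box with two paragraphs.
        String xml =
              "<w:p xmlns:w='" + W_NS + "'>"
            + "<w:r><w:pict><w:txbxContent>"
            + "<w:p><w:r><w:t>keep this</w:t></w:r></w:p>"
            + "<w:p><w:r><w:t>please remove this</w:t></w:r></w:p>"
            + "</w:txbxContent></w:pict></w:r>"
            + "<w:r><w:t>remove (outside the text box)</w:t></w:r>"
            + "</w:p>";
        Document doc = parse(xml);
        int removed = removeTextBoxParagraphs(doc, "remove");
        System.out.println("Removed text-box paragraphs: " + removed);
    }
}
```

The outer run also contains the word "remove", but it is not below w:txbxContent, so the path does not select it. Only the matching paragraph inside the text box should be dropped.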
Code:
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.poi.xwpf.usermodel.*;
    import org.apache.xmlbeans.XmlCursor;
    import org.apache.xmlbeans.XmlObject;
    import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTP;

    public class WordRemoveParagraphInTextBox {

        public static void main(String[] args) throws Exception {

            XWPFDocument document = new XWPFDocument(new FileInputStream("WordRemoveParagraphInTextBox.docx"));

            for (XWPFParagraph paragraph : document.getParagraphs()) {
                // Select all paragraphs inside text-box contents below this body paragraph.
                XmlCursor cursor = paragraph.getCTP().newCursor();
                cursor.selectPath("declare namespace w='http://schemas.openxmlformats.org/wordprocessingml/2006/main' .//*/w:txbxContent/w:p");

                // Collect the selected objects first, so nothing is removed while the selection is still being walked.
                List<XmlObject> ctpsintxtbx = new ArrayList<XmlObject>();
                while (cursor.hasNextSelection()) {
                    cursor.toNextSelection();
                    XmlObject obj = cursor.getObject();
                    ctpsintxtbx.add(obj);
                }

                for (XmlObject obj : ctpsintxtbx) {
                    // Parse a copy of the XML into a CTP, only to read the plain text via XWPFParagraph.
                    CTP ctp = CTP.Factory.parse(obj.xmlText());
                    //CTP ctp = CTP.Factory.parse(obj.newInputStream());
                    XWPFParagraph bufferparagraph = new XWPFParagraph(ctp, document);
                    String text = bufferparagraph.getText();
                    if (text != null && text.contains("remove")) {
                        // Remove the original XML (obj), not the parsed copy.
                        obj.newCursor().removeXml();
                    }
                }
            }

            FileOutputStream out = new FileOutputStream("WordRemoveParagraphInTextBoxNew.docx");
            document.write(out);
            out.close();
            document.close();
        }
    }