
How to delete comments and their references in a Word document with Apache POI


In my project, I'm using a Word document as a template. My supervisor modifies this template and adds comments to it. When the program generates the final document, it needs to remove all comments from the template. Below is the code that I wrote. However, it removes only part of each comment, not all of it.

package com.office;

import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.*;

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.List;

public class BookmarkSetter {

    public static void main(String[] args) throws IOException {
        FileInputStream fis = new FileInputStream("f:\\test.docx");
        XWPFDocument document = new XWPFDocument(fis);

        List<XWPFParagraph> paragraphList = document.getParagraphs();
        for (XWPFParagraph paragraph : paragraphList) {

            List<CTMarkupRange> start = paragraph.getCTP().getCommentRangeStartList();
            for (int i = start.size() - 1; i >= 0; --i) {
                paragraph.getCTP().removeCommentRangeStart(i);
            }
            List<CTMarkupRange> end = paragraph.getCTP().getCommentRangeEndList();
            for (int i = end.size() - 1; i >= 0; --i) {
                paragraph.getCTP().removeCommentRangeEnd(i);
            }
        }
        // Save changes
        FileOutputStream out = new FileOutputStream("f:\\test1.docx");
        document.write(out);
        out.close();
    }
}



Solution

  • Word comments are stored like so:

    In /word/document.xml:

    ...
    <w:p ...
     ...
     <w:commentRangeStart w:id="..."/>
    ...
    <w:p ...
     ...
     <w:commentRangeEnd w:id="..."/>
    ...
     <w:r ...
     ...
      <w:commentReference w:id="..."/>
     </w:r>
    ...
    

    Here, commentRangeStart and commentRangeEnd mark the span of text that is commented. The commentReference, which sits inside a run element, points to the comment itself, which is stored in /word/comments.xml.

    Your code already removes all commentRangeStart and commentRangeEnd marks, but it does not remove the commentReference elements.

    Thus:

    ...
      // needs: import org.apache.poi.xwpf.usermodel.XWPFRun;
      for (XWPFParagraph paragraph : document.getParagraphs()) {
          // remove all comment range start marks
          for (int i = paragraph.getCTP().getCommentRangeStartList().size() - 1; i >= 0; i--) {
              paragraph.getCTP().removeCommentRangeStart(i);
          }
          // remove all comment range end marks
          for (int i = paragraph.getCTP().getCommentRangeEndList().size() - 1; i >= 0; i--) {
              paragraph.getCTP().removeCommentRangeEnd(i);
          }
          // remove every run that holds a comment reference;
          // iterate backwards so removals do not shift the remaining indices
          for (int i = paragraph.getRuns().size() - 1; i >= 0; i--) {
              XWPFRun run = paragraph.getRuns().get(i);
              if (!run.getCTR().getCommentReferenceList().isEmpty()) {
                  paragraph.removeRun(i);
              }
          }
      }
    ...
    

    In addition, the comments themselves, which are stored in /word/comments.xml, should also be removed:

    ...
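      // remove all document comments
      // needs: import org.apache.poi.xwpf.usermodel.XWPFComments;
      XWPFComments comments = document.getDocComments();
      // guard: getDocComments() may return null when the document has no comments part
      if (comments != null) {
          for (int i = comments.getComments().size() - 1; i >= 0; i--) {
              comments.removeComment(i);
          }
      }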
    ...
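
The storage layout described above can also be checked without POI. The following is only a minimal sketch using the JDK's built-in DOM API on a hypothetical, heavily simplified stand-in for /word/document.xml. The class name CommentMarkupStripper, the SAMPLE fragment, and the strip method are assumptions made for illustration and are not part of Apache POI. It removes the same three kinds of elements as the POI code: the range start marks, the range end marks, and each run that holds a commentReference.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of Apache POI.
class CommentMarkupStripper {

    // Namespace of the w: prefix in WordprocessingML.
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Simplified stand-in for /word/document.xml with one commented text span.
    static final String SAMPLE =
            "<w:document xmlns:w=\"" + W_NS + "\"><w:body><w:p>"
            + "<w:commentRangeStart w:id=\"0\"/>"
            + "<w:r><w:t>commented text</w:t></w:r>"
            + "<w:commentRangeEnd w:id=\"0\"/>"
            + "<w:r><w:commentReference w:id=\"0\"/></w:r>"
            + "</w:p></w:body></w:document>";

    // Returns the XML with all comment markup removed from the document body.
    static String strip(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            removeAll(doc, "commentRangeStart", false);
            removeAll(doc, "commentRangeEnd", false);
            // The reference lives inside a run, so the whole run is dropped.
            removeAll(doc, "commentReference", true);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not process XML", e);
        }
    }

    private static void removeAll(Document doc, String localName, boolean removeParentRun) {
        NodeList found = doc.getElementsByTagNameNS(W_NS, localName);
        // Copy first: the NodeList is live and changes while nodes are removed.
        List<Node> snapshot = new ArrayList<>();
        for (int i = 0; i < found.getLength(); i++) {
            snapshot.add(found.item(i));
        }
        for (Node node : snapshot) {
            Node target = removeParentRun ? node.getParentNode() : node;
            // Skip targets that were already detached by an earlier removal.
            if (target != null && target.getParentNode() != null) {
                target.getParentNode().removeChild(target);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(strip(SAMPLE));
    }
}
```

Running main prints the fragment with only the run that holds the commented text left in the paragraph. As in the POI version, this covers only the markup in the document body; the comment entries in /word/comments.xml still have to be removed separately.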