Tags: java, ms-word, apache-poi, bulletedlist

Creating nested bullet lists in Word using POI v5


I am working in Java, using the following Maven dependency (and no others):

  <!-- https://mvnrepository.com/artifact/org.apache.poi/poi-ooxml -->
  <dependencies>
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-ooxml</artifactId>
        <version>5.0.0</version>
    </dependency>
  </dependencies>

and the following class, gleaned from another SO post:

import java.io.FileOutputStream;
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;

import org.apache.poi.xwpf.usermodel.XWPFAbstractNum;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFNumbering;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTAbstractNum;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTLvl;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.STNumberFormat;

public class CreateSimpleWordBulletList
{

  public static void main(String[] args) throws Exception
  {

    CTAbstractNum cTAbstractNum = CTAbstractNum.Factory.newInstance();
    
    // Next we set the AbstractNumId. This requires care.
    // Since we are in a new document we can start numbering from 0.
    // But if we have an existing document, we must determine the next free
    // number first.
    cTAbstractNum.setAbstractNumId(BigInteger.valueOf(0));

    // Bullet list
    CTLvl cTLvl = cTAbstractNum.addNewLvl();
    cTLvl.addNewNumFmt().setVal(STNumberFormat.BULLET);
    cTLvl.addNewLvlText().setVal("•");

    XWPFAbstractNum abstractNum = new XWPFAbstractNum(cTAbstractNum);
    XWPFDocument document = new XWPFDocument();
    XWPFNumbering numbering = document.createNumbering();

    BigInteger abstractNumID = numbering.addAbstractNum(abstractNum);
    BigInteger numID = numbering.addNum(abstractNumID);

    XWPFParagraph paragraph = document.createParagraph();

    XWPFRun run = paragraph.createRun();
    run.setText("A list having defined gap between bullet point and text:");

    ArrayList<String> documentList = new ArrayList<String>(Arrays.asList(new String[] { "One", "Two", "Three" }));
    for (String string : documentList)
    {
      paragraph = document.createParagraph();
      paragraph.setNumID(numID);
      // Indents are set in twips (a twentieth of a point; 1440 twips = 1 inch).
      paragraph.setIndentFromLeft(1440 / 4); // left indent: 360 twips = 1/4 inch
      paragraph.setIndentationHanging(1440 / 4); // hanging indent: 360 twips = 1/4 inch,
                                                 // so the bullet point hangs 1/4 inch
                                                 // before the text, at indentation 0
      run = paragraph.createRun();
      run.setText(string);
    }

    paragraph = document.createParagraph();

    FileOutputStream out = new FileOutputStream("CreateWordSimplestBulletList.docx");
    document.write(out);
    out.close();
    document.close();

  }
}

This creates a bullet list, as I want; I hope to indent it, but that's secondary.

The real modification I need to make to it is to add two more levels of lists, so that the result, in the Word document, resembles the following:

* One
    - AAA
        o aaa
        o bbb
        o ccc
    - BBB
        o xyz
        o abc
* Two
    - AAA
        o mmm
        o nnn
    - ZZZ
        o bbb
        o nnn

etc.

I hope it won't be necessary for someone to write code for me for this, but I neither understand nor can find any documentation for CTAbstractNum, CTLvl, or the XWPF* classes. If there is documentation on those that would suffice, then someone could just point me to that.

I gather from the comment in the code above that the value set in CTAbstractNum.setAbstractNumId() identifies the numbering level document-wide, so that it should be applied to all the items at the outermost level, and that, conceptually, another such ID would be applied to all items at the same level within each bullet (e.g., the 'aaa', 'bbb', 'ccc' strings in the above illustration). I'm guessing different IDs would be created and applied to each such internal list. But I hate guessing about that when I don't know anything about the conceptual model for the API. Trial-and-error programming is such a bore.


Solution

  • I have already provided several answers on how to create Word numberings, including multilevel numberings. For example: Apache poi multiline bullet point is working but not multiple paragaraph? and How can I add List in poi word, ordered number or other symbol for list symbol?.

    But as you are asking for documentation too, I will try to shed more light on this.

    Modern Word documents (*.docx) use the Office Open XML file format. The format was initially standardized by Ecma (as ECMA-376) and, in later versions, by ISO and IEC (as ISO/IEC 29500). A *.docx file is a ZIP archive containing XML and other files in a special directory structure. So one can easily unzip a *.docx file and have a look at the internals.

    Based on those standards, Apache has developed XML beans in the org.openxmlformats.schemas classes. Up to apache poi version 4 those classes were shipped in ooxml-schemas. From apache poi version 5 on they are in poi-ooxml-full. There is a poi-ooxml-lite version too, but it contains only those beans which are used by the high level apache poi classes. So it lacks some beans when it comes to more special use cases.

    Unfortunately there is no publicly available documentation for the org.openxmlformats.schemas classes. But of course one can download the sources and run javadoc on them to have at least the API documentation.

    XWPF is apache poi's high level implementation of the Office Open XML part used by Microsoft Word. It uses the org.openxmlformats.schemas beans to implement more convenient methods. It is documented at https://poi.apache.org/: Apache POI - Javadocs, POI-XWPF - A Quick Guide. But up to now XWPF does not cover all the possibilities and features of Microsoft Word. So for some special use cases one needs knowledge of the XML and of the usage of the org.openxmlformats.schemas beans.

    Since *.docx is simply a ZIP archive, the most expedient way is to create a simple *.docx file using Microsoft Word itself and then unzip it to see what XML was created. Then try to recreate that XML using XWPF and the org.openxmlformats.schemas beans.
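
    Instead of unzipping by hand, a single part of the archive can also be read with the JDK's own ZIP classes. The following is only a minimal sketch: the class name DocxEntryReader and the helper sampleArchive are made up for illustration, and the tiny in-memory archive merely stands in for a real *.docx file so that the sketch runs on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class DocxEntryReader {

  // Returns the text of one entry of a ZIP archive, or null if there is no such entry.
  static String readEntry(InputStream zipStream, String entryName) {
    try (ZipInputStream zip = new ZipInputStream(zipStream)) {
      ZipEntry entry;
      while ((entry = zip.getNextEntry()) != null) {
        if (entry.getName().equals(entryName)) {
          // readAllBytes stops at the end of the current entry
          return new String(zip.readAllBytes(), StandardCharsets.UTF_8);
        }
      }
      return null;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Hypothetical stand-in for a real *.docx: a ZIP with only word/numbering.xml in it.
  static byte[] sampleArchive() {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
      zip.putNextEntry(new ZipEntry("word/numbering.xml"));
      String xml = "<w:numbering xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\">"
          + "<w:abstractNum w:abstractNumId=\"0\"/></w:numbering>";
      zip.write(xml.getBytes(StandardCharsets.UTF_8));
      zip.closeEntry();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  public static void main(String[] args) {
    // With a real file one would pass a FileInputStream of the *.docx instead.
    InputStream in = new ByteArrayInputStream(sampleArchive());
    System.out.println(readEntry(in, "word/numbering.xml"));
  }
}
```

    Used on a document saved by Microsoft Word, the same call with "word/numbering.xml" or "word/document.xml" should return the XML that is to be recreated.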

    When it comes to numberings, one will find that there is a /word/numbering.xml in the *.docx ZIP archive which contains the XML for the numbering definitions. Each definition consists of an abstractNum, which contains the definition for all indent levels of a numbering, and a num which links to that abstractNum. The num has a numId which is used in /word/document.xml to mark those paragraphs which belong to a list. The paragraphs also may have an ilvl (indent level) which tells how deeply they are indented in that list.
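
    The lookup chain from a paragraph to its list marker can be modelled in a few lines of plain Java. This is only an illustration of the concept: the class NumberingModel, its two maps and the IDs used are hypothetical and are not part of apache poi.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NumberingModel {

  // abstractNumId -> marker per indent level (stands in for <w:abstractNum> with its <w:lvl> children)
  static final Map<Integer, List<String>> abstractNums = new HashMap<>();
  // numId -> abstractNumId (stands in for <w:num> linking to an abstractNum)
  static final Map<Integer, Integer> nums = new HashMap<>();

  static {
    abstractNums.put(0, List.of("\u2022", "\u2013", "o"));
    nums.put(1, 0);
  }

  // Resolves the marker of a paragraph from its numId and ilvl.
  static String markerFor(int numId, int ilvl) {
    int abstractNumId = nums.get(numId);
    return abstractNums.get(abstractNumId).get(ilvl);
  }

  public static void main(String[] args) {
    // All paragraphs of one list share the same numId; only the ilvl differs.
    System.out.println(markerFor(1, 0) + " One");
    System.out.println("  " + markerFor(1, 1) + " AAA");
    System.out.println("    " + markerFor(1, 2) + " aaa");
  }
}
```

    So the nested levels do not need IDs of their own: one abstractNum defines all levels, one numId marks the whole list, and each paragraph picks its level through ilvl.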

    XWPF does not fully support creating all features of the abstractNum in the numbering. It provides XWPFAbstractNum, which has a constructor taking an org.openxmlformats.schemas.wordprocessingml.x2006.main.CTAbstractNum. So that CTAbstractNum needs to be created using the low level beans. The simplest way is to create it from XML given as a String. One can get that XML by creating a simple *.docx file having a numbering using Microsoft Word itself and then unzipping the *.docx ZIP archive.
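
    Before handing such an XML string to the beans, it can help to check what it actually defines. The following sketch uses only the JDK's DOM parser, not apache poi, and lists the levels of an abstractNum. The class name AbstractNumInspector and the shortened SAMPLE string are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class AbstractNumInspector {

  static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

  // Shortened sample: two levels of a bullet definition.
  static final String SAMPLE =
      "<w:abstractNum xmlns:w=\"" + W + "\" w:abstractNumId=\"0\">"
    + "<w:lvl w:ilvl=\"0\"><w:numFmt w:val=\"bullet\"/><w:pPr><w:ind w:left=\"720\" w:hanging=\"360\"/></w:pPr></w:lvl>"
    + "<w:lvl w:ilvl=\"1\"><w:numFmt w:val=\"bullet\"/><w:pPr><w:ind w:left=\"1440\" w:hanging=\"360\"/></w:pPr></w:lvl>"
    + "</w:abstractNum>";

  // Returns one line per <w:lvl> element: "ilvl=<n> numFmt=<format> left=<twips>".
  static List<String> describeLevels(String abstractNumXml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true); // needed for the *NS lookups below
      Document doc = factory.newDocumentBuilder()
          .parse(new InputSource(new StringReader(abstractNumXml)));
      List<String> result = new ArrayList<>();
      NodeList levels = doc.getElementsByTagNameNS(W, "lvl");
      for (int i = 0; i < levels.getLength(); i++) {
        Element lvl = (Element) levels.item(i);
        Element numFmt = (Element) lvl.getElementsByTagNameNS(W, "numFmt").item(0);
        Element ind = (Element) lvl.getElementsByTagNameNS(W, "ind").item(0);
        result.add("ilvl=" + lvl.getAttributeNS(W, "ilvl")
            + " numFmt=" + numFmt.getAttributeNS(W, "val")
            + " left=" + ind.getAttributeNS(W, "left"));
      }
      return result;
    } catch (Exception e) {
      throw new IllegalArgumentException("XML could not be parsed", e);
    }
  }

  public static void main(String[] args) {
    describeLevels(SAMPLE).forEach(System.out::println);
  }
}
```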

    If one is able to read XML, then this XML is self-explanatory. The elements are well named. What one needs to know is that the measurement unit for indentation and hanging is the twip (a twentieth of a point, so 1440 twips = 1 inch). Also, the symbols used for bullet points sometimes come from the additional Windows fonts Symbol and/or Wingdings. Those fonts need to be set in the XML too, and the values are then character codes (such as \uF0B7 in the example below) which select the special glyphs of those fonts.
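
    As a small aid for reading the indent values: one twip is 1/20 of a point, and as there are 72 points per inch, 1440 twips make one inch. The helper class Twips below is hypothetical and only shows the arithmetic.

```java
public class Twips {

  static final int TWIPS_PER_POINT = 20;
  static final int TWIPS_PER_INCH = 1440; // 72 points per inch * 20 twips per point

  // Converts inches to twips, rounded to the nearest whole twip.
  static int fromInches(double inches) {
    return (int) Math.round(inches * TWIPS_PER_INCH);
  }

  // Converts points to twips, rounded to the nearest whole twip.
  static int fromPoints(double points) {
    return (int) Math.round(points * TWIPS_PER_POINT);
  }

  public static void main(String[] args) {
    System.out.println(fromInches(0.5));  // 720: a left indent of half an inch
    System.out.println(fromInches(0.25)); // 360: a hanging indent of a quarter inch
    System.out.println(fromPoints(18));   // 360: the same quarter inch given in points
  }
}
```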

    The following complete example shows this. It creates the list you showed once as a bullet list and once as a numbered list.

    import java.io.FileOutputStream;
    
    import org.apache.poi.xwpf.usermodel.*;
    
    import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTAbstractNum;
    import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTNumbering;
    
    import java.math.BigInteger;
    
    import java.util.Map; 
    import java.util.TreeMap; 
    
    public class CreateWordMultilevelLists {
    
     static String cTAbstractNumBulletXML = 
      "<w:abstractNum xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\" w:abstractNumId=\"0\">"
    + "<w:multiLevelType w:val=\"hybridMultilevel\"/>"
    + "<w:lvl w:ilvl=\"0\"><w:start w:val=\"1\"/><w:numFmt w:val=\"bullet\"/><w:lvlText w:val=\"\uF0B7\"/><w:lvlJc w:val=\"left\"/><w:pPr><w:ind w:left=\"720\" w:hanging=\"360\"/></w:pPr><w:rPr><w:rFonts w:ascii=\"Symbol\" w:hAnsi=\"Symbol\" w:hint=\"default\"/></w:rPr></w:lvl>"
    + "<w:lvl w:ilvl=\"1\" w:tentative=\"1\"><w:start w:val=\"1\"/><w:numFmt w:val=\"bullet\"/><w:lvlText w:val=\"\u2013\"/><w:lvlJc w:val=\"left\"/><w:pPr><w:ind w:left=\"1440\" w:hanging=\"360\"/></w:pPr><w:rPr><w:rFonts w:ascii=\"Courier New\" w:hAnsi=\"Courier New\" w:cs=\"Courier New\" w:hint=\"default\"/></w:rPr></w:lvl>"
    + "<w:lvl w:ilvl=\"2\" w:tentative=\"1\"><w:start w:val=\"1\"/><w:numFmt w:val=\"bullet\"/><w:lvlText w:val=\"\u26Ac\"/><w:lvlJc w:val=\"left\"/><w:pPr><w:ind w:left=\"2160\" w:hanging=\"360\"/></w:pPr><w:rPr><w:rFonts w:ascii=\"Courier New\" w:hAnsi=\"Courier New\" w:hint=\"default\"/></w:rPr></w:lvl>"
    + "</w:abstractNum>";   
    
     static String cTAbstractNumDecimalXML = 
      "<w:abstractNum xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\" w:abstractNumId=\"1\">"
    + "<w:multiLevelType w:val=\"hybridMultilevel\"/>"
    + "<w:lvl w:ilvl=\"0\"><w:start w:val=\"1\"/><w:numFmt w:val=\"decimal\"/><w:lvlText w:val=\"%1\"/><w:lvlJc w:val=\"left\"/><w:pPr><w:ind w:left=\"720\" w:hanging=\"360\"/></w:pPr></w:lvl>"
    + "<w:lvl w:ilvl=\"1\" w:tentative=\"1\"><w:start w:val=\"1\"/><w:numFmt w:val=\"decimal\"/><w:lvlText w:val=\"%1.%2\"/><w:lvlJc w:val=\"left\"/><w:pPr><w:ind w:left=\"1440\" w:hanging=\"360\"/></w:pPr></w:lvl>"
    + "<w:lvl w:ilvl=\"2\" w:tentative=\"1\"><w:start w:val=\"1\"/><w:numFmt w:val=\"decimal\"/><w:lvlText w:val=\"%1.%2.%3\"/><w:lvlJc w:val=\"left\"/><w:pPr><w:ind w:left=\"2160\" w:hanging=\"360\"/></w:pPr></w:lvl>"
    + "</w:abstractNum>";
    
     // Parses the given abstractNum XML, registers it in the document's numbering
     // and returns the numId which paragraphs use to refer to it.
     static BigInteger createNumbering(XWPFDocument document, String abstractNumXML) throws Exception {
      CTNumbering cTNumbering = CTNumbering.Factory.parse(abstractNumXML);
      CTAbstractNum cTAbstractNum = cTNumbering.getAbstractNumArray(0);
      XWPFAbstractNum abstractNum = new XWPFAbstractNum(cTAbstractNum);
      XWPFNumbering numbering = document.createNumbering();
      BigInteger abstractNumID = numbering.addAbstractNum(abstractNum);
      BigInteger numID = numbering.addNum(abstractNumID);
      return numID;
     }
     
     // Sets the indent level (ilvl) of a paragraph. This only has an effect if the
     // paragraph already has numbering properties, that is after setNumID was called.
     static void setIndentLevel(XWPFParagraph paragraph, BigInteger level) {
      if (paragraph.getCTP().isSetPPr()) {
       if (paragraph.getCTP().getPPr().isSetNumPr()) {
        if (paragraph.getCTP().getPPr().getNumPr().isSetIlvl()) {
         paragraph.getCTP().getPPr().getNumPr().getIlvl().setVal(level);
        } else {
         paragraph.getCTP().getPPr().getNumPr().addNewIlvl().setVal(level);
        }
       }
      }
     }
     
     // Derives the zero-based indent level from a key such as "1.2.1" (three parts -> level 2).
     static BigInteger getIndentLevelFromNumberingString(String numberingString) {
      String[] levels = numberingString.split("\\.");
      int level = levels.length - 1;
      return BigInteger.valueOf(level);
     }
     
     // Creates one numbered paragraph per map entry; the key determines the indent level.
     static void insertListContent(XWPFDocument document, TreeMap<String, String> listContent, BigInteger numID) {
      for (Map.Entry<String, String> entry : listContent.entrySet()) {
       String key = entry.getKey();
       String value = entry.getValue();
       XWPFParagraph paragraph = document.createParagraph();
       paragraph.setNumID(numID);
       setIndentLevel(paragraph, getIndentLevelFromNumberingString(key));
       XWPFRun run = paragraph.createRun();
       run.setText(value); 
       // No spacing after the list items, except after the last one.
       if (!entry.equals(listContent.lastEntry())) paragraph.setSpacingAfter(0);
      } 
     }
    
     public static void main(String[] args) throws Exception {
         
      TreeMap<String, String> listContent = new TreeMap<String, String>();
      listContent.put("1", "One");
      listContent.put("1.1", "AAA");
      listContent.put("1.1.1", "aaa");
      listContent.put("1.1.2", "bbb");
      listContent.put("1.1.3", "ccc");
      listContent.put("1.2", "BBB");
      listContent.put("1.2.1", "xyz");
      listContent.put("1.2.2", "abc");
      listContent.put("2", "Two");
      listContent.put("2.1", "AAA");
      listContent.put("2.1.1", "mmm");
      listContent.put("2.1.2", "nnn");
      listContent.put("2.2", "ZZZ");
      listContent.put("2.2.1", "bbb");
      listContent.put("2.2.2", "nnn");
      
      XWPFDocument document = new XWPFDocument();
      
      BigInteger numIDBulletList = createNumbering(document, cTAbstractNumBulletXML);
      BigInteger numIDDecimalList = createNumbering(document, cTAbstractNumDecimalXML);
      
      XWPFParagraph paragraph = document.createParagraph();
      XWPFRun run=paragraph.createRun();  
      run.setText("The bullet list:");
      
      insertListContent(document, listContent, numIDBulletList);
      
      paragraph = document.createParagraph();
      run=paragraph.createRun();  
      run.setText("Paragraph after the list.");
      
      paragraph = document.createParagraph();
      run=paragraph.createRun();  
      run.setText("The decimal list:");
      
      insertListContent(document, listContent, numIDDecimalList);
      
      paragraph = document.createParagraph();
      run=paragraph.createRun();  
      run.setText("Paragraph after the list.");
    
      // try-with-resources closes the stream even if writing fails
      try (FileOutputStream out = new FileOutputStream("./CreateWordMultilevelLists.docx")) {
       document.write(out);
      }
      document.close();
    
     }
    }
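
    One caveat about the keys: a TreeMap<String, String> sorts its keys as plain strings, so a key such as "10" would sort before "2". The example data never goes beyond single digits, so it is not affected. For longer lists, a comparator which compares the dot-separated parts numerically could be passed to the TreeMap constructor. The class NumberingKeyComparator below is a hypothetical sketch of that idea, not part of the example above.

```java
import java.util.Comparator;
import java.util.TreeMap;

public class NumberingKeyComparator implements Comparator<String> {

  // Compares keys like "2.1" and "10" part by part as numbers.
  @Override
  public int compare(String a, String b) {
    String[] left = a.split("\\.");
    String[] right = b.split("\\.");
    int common = Math.min(left.length, right.length);
    for (int i = 0; i < common; i++) {
      int byPart = Integer.compare(Integer.parseInt(left[i]), Integer.parseInt(right[i]));
      if (byPart != 0) return byPart;
    }
    // Same prefix: the shorter key is the parent item and comes first.
    return Integer.compare(left.length, right.length);
  }

  public static void main(String[] args) {
    TreeMap<String, String> listContent = new TreeMap<>(new NumberingKeyComparator());
    listContent.put("10", "Ten");
    listContent.put("2", "Two");
    listContent.put("2.1", "AAA");
    listContent.put("1", "One");
    System.out.println(listContent.keySet()); // [1, 2, 2.1, 10]
  }
}
```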