
python-docx bullet point size & inheriting styles


I am working on a Python 3 program using the python-docx library. It opens an existing .docx file and replaces the [[FRUIT_ITEM]] placeholder with values from a list. The list can hold one or more items, and each item should get its own bullet point (meaning I have to add a new paragraph for each entry, and then format it as a bullet).

The result works, but for some bizarre reason the existing bullet point in the template (created by clicking the "Create a bullet list" button) is ever so slightly smaller than those added by python-docx.


08/03/2023 Update: The same thing occurs when I remove the existing first bullet point from the template entirely, and have python-docx handle the creation of all bullets using the create_list code below. The first one will be a few pixels smaller than the rest, which makes this even weirder!


The original template:

[screenshot: the bullet list in the original template]

The result:

[screenshot: the generated bullet list]

For each paragraph and the text "runs" it contains, I am specifying the same font (Arial) and size (10 pt) that the rest of the document uses. The code used to turn a paragraph object into a list item is:

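from docx.oxml import OxmlElement
from docx.oxml.ns import qn

def create_list(paragraph):
    # numId points at a numbering definition in the document's numbering part;
    # "1" is assumed to be the unordered (bullet) list in this template
    list_type = "1"
    p = paragraph._p
    pPr = p.get_or_add_pPr()
    # Build <w:numPr><w:numId w:val="1"/></w:numPr> and attach it to the paragraph
    numPr = OxmlElement('w:numPr')
    numId = OxmlElement('w:numId')
    numId.set(qn('w:val'), list_type)
    numPr.append(numId)
    pPr.append(numPr)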
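
As far as I understand the file format, the numId is not a fixed constant: it refers to a w:num entry in the template's word/numbering.xml, which in turn points at a w:abstractNum holding the actual format. A minimal sketch for checking which IDs are bullets, using only the standard library; the helper name list_numbering_formats is my own, and it only looks at list level 0 and ignores per-list level overrides:

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace in ElementTree's "{uri}tag" notation
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def list_numbering_formats(docx_file):
    """Map each w:numId to the number format of its top list level.

    docx_file may be a path or a binary file-like object.
    Returns {} when the document has no word/numbering.xml part.
    """
    with zipfile.ZipFile(docx_file) as archive:
        if "word/numbering.xml" not in archive.namelist():
            return {}
        root = ET.fromstring(archive.read("word/numbering.xml"))

    # abstractNumId -> format of level 0 (e.g. "bullet", "decimal")
    abstract_formats = {}
    for abstract in root.findall(W + "abstractNum"):
        for lvl in abstract.findall(W + "lvl"):
            if lvl.get(W + "ilvl") == "0":
                fmt = lvl.find(W + "numFmt")
                abstract_formats[abstract.get(W + "abstractNumId")] = (
                    fmt.get(W + "val") if fmt is not None else None
                )

    # numId -> format, resolved through the abstract definition it points at
    formats = {}
    for num in root.findall(W + "num"):
        ref = num.find(W + "abstractNumId")
        if ref is not None:
            formats[num.get(W + "numId")] = abstract_formats.get(ref.get(W + "val"))
    return formats
```

Calling list_numbering_formats("my_word_file.docx") returns a dict keyed by numId; an entry whose value is "bullet" is the kind of ID that create_list should be given.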

And to insert it after the current paragraph (note I am using "List Paragraph", as the "List Bullet" suggested in other threads does not exist in my document styles for some reason):

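from docx.oxml import OxmlElement
from docx.shared import Pt
from docx.text.paragraph import Paragraph

def insert_paragraph_after(paragraph, text, style="List Paragraph"):
    # Create an empty <w:p> and place it directly after the given paragraph
    new_p = OxmlElement("w:p")
    paragraph._p.addnext(new_p)
    new_para = Paragraph(new_p, paragraph._parent)
    if text:
        new_para.add_run(text)
    new_para.style = style
    # Match the font used by the rest of the document
    for run in new_para.runs:
        run.font.size = Pt(10)
        run.font.name = 'Arial'
    return new_para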

And here is the code to call both of the above:

from docx import Document

template_variables = {
    "[[FRUIT_ITEM]]": ['Apple', 'Banana', 'Pear', 'Pineapple'],
    "[[CHEESE_TYPE]]": ['mild', 'sharp']
}

template_document = Document("my_word_file.docx")

for variable_key, variable_value in template_variables.items():
    for paragraph in template_document.paragraphs:
        if variable_key == "[[FRUIT_ITEM]]":
            if variable_key in paragraph.text:
                inline = paragraph.runs
                for item in inline:
                    if variable_key in item.text:
                        # Replace the existing bullet point with the first fruit
                        item.text = item.text.replace(variable_key, variable_value[0])

                # Add new lines for any remaining fruit in the list, skipping the first.
                # Each bullet goes after the previously inserted one; always inserting
                # after `paragraph` would leave the extra bullets in reverse order.
                if len(variable_value) > 1:
                    previous = paragraph
                    for fruit in variable_value[1:]:
                        previous = insert_paragraph_after(previous, fruit)
                        create_list(previous)

I believe the solution lies in the new bullets inheriting the style of the existing one, but I'm at a loss as to where the formatting difference originates. When I examine the resulting file in Word, the font size and bullet style are reported as identical for both, even though the bullets are clearly visually different.

Thanks in advance for any attempts to save my sanity!


Solution

  • Solved it: the cause was some formatting quirk in the original template file that I used (which was created by a third party). I recreated the template from scratch in a new blank MS Word document, applied the needed formatting manually and saved it. Once processed by the script, the resulting file displays all bullet points at the same size.
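
For anyone who would rather locate a quirk like this than rebuild the template, one option is to compare the raw XML of the bullet paragraphs. The sketch below uses only the standard library and prints, for each paragraph that carries w:numPr, its style, numbering ID, list level, the size set on the paragraph mark, and the sizes set on its runs. It rests on an assumption that was not verified against the original template: that Word draws the bullet glyph using the paragraph mark's run properties (w:pPr/w:rPr) unless the numbering level defines its own. The helper names are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace in ElementTree's "{uri}tag" notation
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def _val(parent, path):
    """Return the w:val attribute of the element at path, or None."""
    if parent is None:
        return None
    node = parent.find(path)
    return node.get(W + "val") if node is not None else None


def describe_list_paragraphs(docx_file):
    """Summarise the formatting of every paragraph that carries w:numPr.

    Sizes are reported as stored in the XML, i.e. in half-points
    ("20" means 10 pt). None means the value is not set directly
    on the paragraph or run and is therefore inherited.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    rows = []
    for p in root.iter(W + "p"):
        pPr = p.find(W + "pPr")
        if pPr is None or pPr.find(W + "numPr") is None:
            continue  # no direct numbering on this paragraph
        rows.append({
            "text": "".join(t.text or "" for t in p.iter(W + "t")),
            "style": _val(pPr, W + "pStyle"),
            "numId": _val(pPr, W + "numPr/" + W + "numId"),
            "ilvl": _val(pPr, W + "numPr/" + W + "ilvl"),
            # run properties of the paragraph mark itself
            "mark_size": _val(pPr, W + "rPr/" + W + "sz"),
            # sizes set directly on each run of the paragraph
            "run_sizes": [_val(r, W + "rPr/" + W + "sz") for r in p.findall(W + "r")],
        })
    return rows
```

Running it on the generated file and comparing the first row with the others shows whether the template's bullet differs in mark_size, ilvl or style from the paragraphs built by create_list (which, as written above, sets only w:numId and no w:ilvl).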