python python-docx

Why is the loop not printing the correct statements?


I am working on some code to check the colour of the font used in a Word document. I am using Python 3.10.4 and the python-docx library (version 0.8.1.1) in PyCharm (Community Edition 2021.3.3).

The text being checked here is formatted with the 'Normal' style. The only accepted colour is the automatic black, which python-docx reports as None because it is the default colour.

When I execute my code (as shown below) the only statement that is printed is: "Normal text font colour is black." This was the result when I used a document with text in both black and red. So in this instance it should have printed, "Normal text contains unrecognised font color(s):", along with the contents of norm_misc_color.

I believe this error may be due to the way None is being used in the last block of if/elif checks. The sets norm_color and norm_misc_color print the correct values as required. I would like to know how I can print the correct statement under the specific conditions. Any form of help would be appreciated. If there are any questions regarding the code, please ask.

import docx  # import the python-docx library
WordFile = docx.Document("state/the/file/directory/here")  # Word document file directory for python-docx 

norm_color = set()  # store all Normal style font colors in the set norm_color
norm_misc_color = set()  # store unacceptable Normal style font colors in the set norm_misc_color
for paragraph in WordFile.paragraphs:
    if "Normal" == paragraph.style.name:
        for run in paragraph.runs:
            # check for duplicates and store unique values in the set norm_color
            if run.font.color.rgb not in norm_color:
                norm_color.add(run.font.color.rgb)
                # check if font colors are unacceptable, if so, store in the set norm_misc_color
                if run.font.color.rgb is not None:
                    norm_misc_color.add(run.font.color.rgb)

    # check if all elements in norm_color are "None" 
if None in norm_color:
    # print this if all elements in norm_color are "None" 
    print("Normal text font colour is black.")
    # check if all elements in norm_color are not "None" 
elif None not in norm_color:
    # print this if all elements in norm_color are not "None" and print norm_misc_color content
    print("Normal text contains unrecognised font color(s):", norm_misc_color)
    # print this if all above conditions were not satisfied
else:
    print("Normal text font colour operation failed.")

Solution

  • Your if condition at the end of your code is only checking that there is some text in the file that uses the default font color. It doesn't exclude files that contain more than one text color, as long as the default color is included somewhere.

    There are a few different ways you can change your if/elif checks to handle the situation the way your comments say you want to. You could test that norm_color is a set that contains only None and nothing else with:

    if norm_color == {None}:
    

    Or you could check that norm_misc_color is empty (since you added all non-None colors to it):

    if not norm_misc_color:  # an empty set is falsy
    

    I'd note that you probably don't need both an if and an elif. It seems the intent is for every possible file to be handled one way or the other, and because the elif condition is the exact negation of the if condition, nothing can ever fall through to the else. You could replace the negated elif with a plain else and delete the "operation failed" case as something that can't happen.