Search code examples
pythonpython-docx

highlight text using python-docx


I want to highlight text in docx and save it another file. here is my code

from docx import Document

def highlight_text(filename):

    doc = Document(filename)
    for p in doc.paragraphs:
        if 'vehicle' in p.text:
            inline = p.runs
            # print(inline)
            # Loop added to work with runs (strings with same style)
            for i in range(len(inline)):
                # print((inline[i].text).encode('ascii'))
                if 'vehicle' in inline[i].text:
                    x=inline[i].text.split('vehicle')
                    inline[i].clear()
                    for j in range(len(x)-1):
                        inline[i].add_text(x[j])
                        y=inline[i].add_text('vehicle')
                        y.highlight_color='YELLOW'
            # print (p.text)

    doc.save('t2.docx')
    return 1
if __name__ == '__main__':

    highlight_text('t1.docx')

word is not getting highlighted what i am doing wrong.


Solution

  • Highlighting is an attribute of a font, not a run directly. Also, Run.add_text() returns a _Text object, not a run.

    from docx.enum.text import WD_COLOR_INDEX
    
    for paragraph in document.paragraphs:
        if 'vehicle' in paragraph.text:
            for run in paragraph.runs:
                if 'vehicle' in run.text:
                    x = run.text.split('vehicle')
                    run.clear()
                    for i in range(len(x)-1):
                        run.add_text(x[i])
                        run.add_text('vehicle')
                        run.font.highlight_color = WD_COLOR_INDEX.YELLOW
    

    Also, a highlight is applied to the entire run, so you need to create a separate runs for each of the text before "vehicle", the "vehicle" word itself, and the text after "vehicle".

    Also, there's no guarantee that a given word appears completely within a single run; runs often split within a word. So you'll need to be more sophisticated in your approach to handle the general case.

    So there's quite a bit more work to do here, but this should get you seeing at least some yellow highlighting :)