Search code examples
pythonpython-2.7python-docx

Python docx Replace string in paragraph while keeping style


I need help replacing a string in a word document while keeping the formatting of the entire document.

I'm using python-docx, after reading the documentation, it works with entire paragraphs, so I loose formatting like words that are in bold or italics. Including the text to replace is in bold, and I would like to keep it that way. I'm using this code:

from docx import Document
def replace_string2(filename):
    doc = Document(filename)
    for p in doc.paragraphs:
        if 'Text to find and replace' in p.text:
            print 'SEARCH FOUND!!'
            text = p.text.replace('Text to find and replace', 'new text')
            style = p.style
            p.text = text
            p.style = style
    # doc.save(filename)
    doc.save('test.docx')
    return 1

So if I implement it and want something like (the paragraph containing the string to be replaced loses its formatting):

This is paragraph 1, and this is a text in bold.

This is paragraph 2, and I will replace old text

The current result is:

This is paragraph 1, and this is a text in bold.

This is paragraph 2, and I will replace new text


Solution

  • I posted this question (even though I saw a few identical ones on here), because none of those (to my knowledge) solved the issue. There was one using a oodocx library, which I tried, but did not work. So I found a workaround.

    The code is very similar, but the logic is: when I find the paragraph that contains the string I wish to replace, add another loop using runs. (this will only work if the string I wish to replace has the same formatting).

    def replace_string(filename):
        doc = Document(filename)
        for p in doc.paragraphs:
            if 'old text' in p.text:
                inline = p.runs
                # Loop added to work with runs (strings with same style)
                for i in range(len(inline)):
                    if 'old text' in inline[i].text:
                        text = inline[i].text.replace('old text', 'new text')
                        inline[i].text = text
                print p.text
    
        doc.save('dest1.docx')
        return 1