I'm creating a PowerPoint with python-pptx and my query result string contains the HTML '<br><br>', which I'm trying to replace with '\n' like this:
TDPsFirst = "\n" + self.TxtStringFromSQLserver.replace('<br><br>', '\n')
TDPs = TDPsFirst.replace('<br>', '\n')
TipDPsText_run.text = TDPs
This results in the lines ending with '_x000D_'.
What am I doing wrong? How can I convert the '<br>' to returns?
This behavior is a little bit new, but is the expected behavior:
https://python-pptx.readthedocs.io/en/latest/api/text.html#pptx.text.text._Run.text
A run can only contain text. A line-break or paragraph boundary happens at a higher level. In particular, a line-break can only occur between runs, inside a paragraph. A paragraph "break" can only occur in a text-frame, between, well, paragraphs.
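To illustrate the run/line-break relationship described above, here is a minimal sketch that writes several lines into a single paragraph, one run per line with a line break between runs. The helper name `add_lines` is hypothetical; it assumes `paragraph` is a python-pptx paragraph and uses only its `add_run()` and `add_line_break()` methods.

```python
def add_lines(paragraph, lines):
    """Write `lines` into one paragraph, separated by line breaks.

    `paragraph` is assumed to be a python-pptx paragraph object; only its
    add_run() and add_line_break() methods are used here.
    """
    for index, line in enumerate(lines):
        if index > 0:
            # The break goes between runs, never inside a run's text.
            paragraph.add_line_break()
        run = paragraph.add_run()
        run.text = line  # each run carries plain text only
```

With a real text frame this would be called as something like `add_lines(text_frame.paragraphs[0], ["first", "second"])`.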
So depending on what you're trying to do, the solution may just be to make the assignment at the text-frame level rather than at the run level, as your variable name TipDPsText_run suggests. Line-feed characters (\n) are accepted by TextFrame.text and are turned into paragraph boundaries.
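A rough sketch of the text-frame-level assignment might look like the following. The helper name `br_to_paragraphs`, the sample string, and the output file name are made up for illustration; the `demo` function assumes python-pptx is installed and uses the blank layout of its default template.

```python
def br_to_paragraphs(text_frame, raw):
    """Convert <br> tags to line feeds and assign at the text-frame level."""
    # Handle the double tag first so it yields one boundary, not two.
    text = raw.replace("<br><br>", "\n").replace("<br>", "\n")
    text_frame.text = text  # each "\n" becomes a paragraph boundary
    return text


def demo(path="demo.pptx"):
    # Imported here so the helper above carries no dependency of its own.
    from pptx import Presentation
    from pptx.util import Inches

    prs = Presentation()
    slide = prs.slides.add_slide(prs.slide_layouts[6])  # blank layout
    box = slide.shapes.add_textbox(Inches(1), Inches(1), Inches(6), Inches(2))
    br_to_paragraphs(box.text_frame, "first<br><br>second<br>third")
    prs.save(path)
```

Calling `demo()` should produce a one-slide deck whose text box holds three paragraphs.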
That may not entirely solve the problem, but it may (I give it a 90% likelihood), and it will at least change the question to one that can be solved.
UPDATE: After further review of the code, in fact a newline by itself ("\x0A") is accepted by Run.text and placed unchanged into the XML, where it probably looks pretty much like a line-break. This legacy courtesy does not extend to a carriage-return ("\x0D"), which is rendered just as you see, as "_x000D_". This extra CR character is most likely in there because you're running on Windows. Accordingly, you may be able to work around this by making sure only "\x0A", with no "\x0D", reaches your text assignment. But I recommend the text-frame level assignment as the approach more consistent with PowerPoint behavior, where typing a carriage-return creates a new paragraph.
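One thing worth noting: in a Python string literal, "\n" and "\x0A" are the same character, so the CR is presumably already present in the string coming back from the query. A minimal sketch for stripping it before assignment, assuming that is where it comes from (the helper name `normalize_line_endings` and the sample string are hypothetical):

```python
import re


def normalize_line_endings(text):
    """Collapse CRLF pairs and bare CRs into a single line feed."""
    return re.sub(r"\r\n?", "\n", text)


# "\n" and "\x0A" denote the same character in Python.
assert "\n" == "\x0A"

raw = "first\r\nsecond\rthird\nfourth"
clean = normalize_line_endings(raw)
```

Running the query string through this before the `<br>` replacement, or after it, should leave no "\x0D" for python-pptx to escape.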