body_replace_all_text() in package "officer" doesn't find my text


The code below doesn't work:

library(officer)
library(magrittr)

read_docx("/home/user/document.docx") %>%
  body_replace_all_text("placeholder1", "text1") %>%
  print(target = "/home/user/out.docx")

Output:

Found 0 instances of 'placeholder1' in the document.

But if I use the string "tjsdhgudfhgku" instead of "placeholder1", it works.

document.docx:

tjsdhgudfhgku
placeholder1 blahblahblah
blah-blah

Why is that?


Solution

  • The following explanation is copied from the help file of the function:

    [...] Note that the behind-the-scenes representation of text in a Word document is frequently not what you might expect! Sometimes a paragraph of text is broken up (or "chunked") into several "runs," as a result of style changes, pauses in text entry, later revisions and edits, etc. If you have not styled the text, and have entered it in an "all-at-once" fashion, e.g. by pasting it or by outputting it programmatically into your Word document, then this will likely not be a problem. If you are working with a manually-edited document, however, this can lead to unexpected failures to find text.

    You can use the officer function docx_show_chunk to show how the paragraph of text at the current cursor has been chunked into runs, and what text is in each chunk. This can help troubleshoot unexpected failures to find text. [...]
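A minimal sketch of that troubleshooting step, under some assumptions: instead of the asker's file, it builds a stand-in document programmatically (so every paragraph is written "all at once"), moves the cursor to the paragraph containing the placeholder with cursor_reach(), and inspects its runs with docx_show_chunk(). The temporary output path and the variable names are illustrative only.

```r
library(officer)

# Stand-in for document.docx: each paragraph is added in one go,
# so each one should end up stored as a single run.
doc <- read_docx()
doc <- body_add_par(doc, "tjsdhgudfhgku")
doc <- body_add_par(doc, "placeholder1 blahblahblah")
doc <- body_add_par(doc, "blah-blah")

# Move the cursor to the paragraph that contains the placeholder
# and print how that paragraph is chunked into runs.
doc <- cursor_reach(doc, keyword = "placeholder1")
docx_show_chunk(doc)

# Search the whole document, not only the paragraph at the cursor.
doc <- body_replace_all_text(doc, "placeholder1", "text1",
                             only_at_cursor = FALSE)

# Write to a temporary file (hypothetical path) and read the text back.
out <- tempfile(fileext = ".docx")
print(doc, target = out)
txt <- docx_summary(read_docx(out))$text
print(txt)
```

To check the real file, replace the first four statements with doc <- read_docx("/home/user/document.docx") and keep the cursor_reach() and docx_show_chunk() calls. If the placeholder turns out to be spread over several chunks (for instance, hypothetically, "placeholder" and "1"), that would explain the "Found 0 instances" message; retyping or pasting the placeholder into Word in one go should usually put it back into a single run.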