google-apps-script, google-docs, google-docs-api

How to replace a document fragment (which matches a regular expression) with "deleted"?


How can I replace a fragment of a document that matches a regular expression with "deleted", given that the fragment (for example, everything between {{ and }}) can contain plain text, images, and other kinds of content?

Input:

... 
{{ 
lorem ipsum
*image*
lorem ipsum
}}
...

Input image

Output:

...
deleted
...

Output image

What I tried (this only works with text):

function myFunction() {
  const replaceText = "deleted"; // Replacement text.

  const doc = DocumentApp.getActiveDocument();
  // Collect every {{ ... }} fragment from the plain text of the body.
  const matches = doc.getBody().getText().match(/\{\{[\s\S]+?\}\}/g);
  if (!matches || matches.length === 0) return;
  // One replaceAllText request per matched fragment (requires the Docs advanced service).
  const requests = matches.map(text => ({ replaceAllText: { containsText: { matchCase: false, text }, replaceText } }));
  Docs.Documents.batchUpdate({ requests }, doc.getId());
}

Solution

  • Issue and workaround:

    When images are included in the range enclosed by {{ and }}, the script you showed cannot be used, because text that includes images cannot be replaced with replaceAllText. I think this is the reason for your current issue.

    To achieve your goal, the indexes of {{ and }} need to be detected. Based on your sample image, this answer assumes that {{ and }} are each in their own paragraph. Please be careful about this.

    Reflected in a sample script, this becomes the following.

    Sample script:

    Before you use this script, please enable the Google Docs API at Advanced Google services.

    function myFunction() {
      const replaceText = "deleted"; // This is from your question.

      const doc = DocumentApp.getActiveDocument();
      const docId = doc.getId();

      // Walk the body with the Docs API and collect one [startIndex, endIndex] range
      // per pair of paragraphs whose trimmed text is exactly "{{" and "}}".
      const ranges = Docs.Documents.get(docId).body.content.reduce((ob, o) => {
        if (o.paragraph && o.paragraph.elements.length > 0) {
          o.paragraph.elements.forEach(oo => {
            if (oo.textRun && oo.textRun.content) {
              const value = oo.textRun.content.trim();
              if (value === "{{") {
                ob.temp = o; // Remember the opening paragraph.
              } else if (ob.temp && value === "}}") {
                // Start just after "{{" and stop just before "}}".
                ob.ar.push([ob.temp.startIndex + 2, o.endIndex - 3]);
                ob.temp = null;
              }
            }
          });
        }
        return ob;
      }, { ar: [], temp: null }).ar.reverse(); // Delete from the end so earlier indexes stay valid.

      if (ranges.length > 0) {
        const requests = ranges.map(([startIndex, endIndex]) => ({ deleteContentRange: { range: { startIndex, endIndex } } }));
        Docs.Documents.batchUpdate({ requests }, docId);
        doc.saveAndClose();
      }

      // Replace what is left between the markers, and the markers themselves, with the replacement text.
      DocumentApp.getActiveDocument().getBody().replaceText("{{.*}}", replaceText);
    }
    
    • When this script is run, the indexes of {{ and }} are retrieved first. Then the content between {{ and }} is deleted with the Docs API. Finally, the replacement text is put in place of the remaining markers.
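
    To make the index arithmetic easier to follow, here is a minimal sketch that runs the same range detection against a hand-built mock of body.content, so it can be tried outside Apps Script. The names findMarkerRanges and para, the inline object ID, and all mock indexes are assumptions made for illustration; they are not taken from a real document.

```javascript
// Hypothetical helper that mirrors the reduce() in the sample script.
// It returns [startIndex, endIndex] pairs, last pair first.
function findMarkerRanges(content) {
  const ranges = [];
  let open = null; // Paragraph holding the most recent "{{".
  for (const block of content) {
    if (!block.paragraph) continue;
    for (const el of block.paragraph.elements || []) {
      if (!el.textRun || !el.textRun.content) continue;
      const value = el.textRun.content.trim();
      if (value === "{{") {
        open = block;
      } else if (open && value === "}}") {
        // Keep the two characters of "{{" and the "}}" plus its newline.
        ranges.push([open.startIndex + 2, block.endIndex - 3]);
        open = null;
      }
    }
  }
  // Reverse so that deletions run from the end of the document backwards.
  return ranges.reverse();
}

// Builds a mock one-run paragraph; endIndex is exclusive, as in the Docs API.
const para = (startIndex, text) => ({
  startIndex,
  endIndex: startIndex + text.length,
  paragraph: {
    elements: [{ startIndex, endIndex: startIndex + text.length, textRun: { content: text } }],
  },
});

// Mock body: text, "{{", text, an image paragraph, text, "}}", text.
const content = [
  para(1, "before\n"),      // 1..8
  para(8, "{{\n"),          // 8..11
  para(11, "lorem ipsum\n"), // 11..23
  {
    startIndex: 23,
    endIndex: 25,
    paragraph: {
      elements: [
        { startIndex: 23, endIndex: 24, inlineObjectElement: { inlineObjectId: "hypothetical-id" } },
        { startIndex: 24, endIndex: 25, textRun: { content: "\n" } },
      ],
    },
  },
  para(25, "lorem ipsum\n"), // 25..37
  para(37, "}}\n"),          // 37..40
  para(40, "after\n"),       // 40..46
];

const ranges = findMarkerRanges(content);
console.log(JSON.stringify(ranges)); // [[10,37]]
```

    In this mock, the range starts at the newline after {{ (index 10) and ends just before }} (index 37, exclusive), so the image paragraph is removed together with the text and only {{}} is left in a single paragraph for replaceText to match.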

    Note:

    • Based on your sample image, this answer assumes that {{ and }} are each in their own paragraph. Please be careful about this.
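
    If you are not sure whether a document meets this assumption, a check like the following could be run first. This is only a sketch and not part of the script above: findNonStandaloneMarkers and p are hypothetical names, and the mock data stands in for the body.content returned by Docs.Documents.get.

```javascript
// Hypothetical pre-flight check: lists paragraphs that contain "{{" or "}}"
// together with other text, which the sample script would not handle.
function findNonStandaloneMarkers(content) {
  const problems = [];
  for (const block of content) {
    if (!block.paragraph) continue;
    // Join all text runs of the paragraph; non-text elements count as empty.
    const text = (block.paragraph.elements || [])
      .map(el => (el.textRun && el.textRun.content) || "")
      .join("");
    const trimmed = text.trim();
    const hasMarker = text.includes("{{") || text.includes("}}");
    if (hasMarker && trimmed !== "{{" && trimmed !== "}}") {
      problems.push({ startIndex: block.startIndex, text: trimmed });
    }
  }
  return problems;
}

// Mock paragraph builder (indexes are made up for illustration).
const p = (startIndex, text) => ({
  startIndex,
  endIndex: startIndex + text.length,
  paragraph: { elements: [{ textRun: { content: text } }] },
});

const sample = [
  p(1, "{{\n"),               // standalone, fine
  p(4, "intro {{ inline\n"),  // marker mixed with other text
  p(20, "}}\n"),              // standalone, fine
];

const problems = findNonStandaloneMarkers(sample);
console.log(JSON.stringify(problems)); // [{"startIndex":4,"text":"intro {{ inline"}]
```

    An empty result suggests the markers are in their own paragraphs. Any entry in the list points to a paragraph that would need to be adjusted by hand before running the script.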
