google-apps-script hyperlink google-docs

Get the first hyperlink and its text value


I hope everyone is in good health. Recently, I have been working with Google Docs hyperlinks using Apps Script and learning along the way. I was trying to get all hyperlinks and edit them, and for that I found some excellent code in this post. I have read the code several times and now have a good understanding of how it works.
My confusion
My confusion is about the recursion in this code. I am familiar with the concept of recursive functions, but when I try to modify the code to get only the first hyperlink from the document, I cannot work out how to do that without breaking the recursive function.
Here is the code I am working with:

/**
 * Get an array of all LinkUrls in the document. The function is
 * recursive, and if no element is provided, it will default to
 * the active document's Body element.
 *
 * @param {Element} element The document element to operate on.
 *
 * @returns {Array}         Array of objects, viz.
 *                              {element,
 *                               startOffset,
 *                               endOffsetInclusive, 
 *                               url}
 */
function getAllLinks(element) {
  var links = [];
  element = element || DocumentApp.getActiveDocument().getBody();
  
  if (element.getType() === DocumentApp.ElementType.TEXT) {
    var textObj = element.editAsText();
    var text = element.getText();
    var inUrl = false;
    for (var ch=0; ch < text.length; ch++) {
      var url = textObj.getLinkUrl(ch);
      if (url != null) {
        if (!inUrl) {
          // We are now!
          inUrl = true;
          var curUrl = {};
          curUrl.element = element;
          curUrl.url = String( url ); // grab a copy
          curUrl.startOffset = ch;
        }
        else {
          curUrl.endOffsetInclusive = ch;
        }          
      }
      else {
        if (inUrl) {
          // Not any more, we're not.
          inUrl = false;
          links.push(curUrl);  // add to links
          curUrl = {};
        }
      }
    }
    if (inUrl) {
      // in case the link ends on the same char that the element does
      links.push(curUrl); 
    }
  }
  else {
    var numChildren = element.getNumChildren();
    for (var i=0; i<numChildren; i++) {
      links = links.concat(getAllLinks(element.getChild(i)));
    }
  }

  return links;
}


I tried adding

if (links.length > 0){
     return links;
}

but it does not stop the function: because the function is recursive, control returns to the previous calls, which continue running. Here is the test document, along with the script I am working on.
https://docs.google.com/document/d/1eRvnR2NCdsO94C5nqly4nRXCttNziGhwgR99jElcJ_I/edit?usp=sharing

I hope you understand what I am trying to convey. Thanks for taking a look at my post. Stay happy :D


Solution

  • I believe your goal is as follows.

    • You want to retrieve the first link and its text from the shared document using Google Apps Script.
    • You want to stop the recursion once the first link has been retrieved.

    Modification points:

    • I tried adding

        if (links.length > 0){
             return links;
        }
      
    • but it does not stop the function, because it is recursive: control returns to the previous calls, which continue running.

    Unfortunately, I couldn't tell where in your script you added this snippet. In this case, the loops need to stop as soon as links holds a value, and the link text needs to be retrieved as well. So, how about modifying the script as follows? The modified parts are marked with "Added".

    Modified script:

    function getAllLinks(element) {
      var links = [];
      element = element || DocumentApp.getActiveDocument().getBody();
      
      if (element.getType() === DocumentApp.ElementType.TEXT) {
        var textObj = element.editAsText();
        var text = element.getText();
        var inUrl = false;
        for (var ch=0; ch < text.length; ch++) {
    
          if (links.length > 0) break; // <--- Added
    
          var url = textObj.getLinkUrl(ch);
          if (url != null) {
            if (!inUrl) {
              // We are now!
              inUrl = true;
              var curUrl = {};
              curUrl.element = element;
              curUrl.url = String( url ); // grab a copy
              curUrl.startOffset = ch;
              curUrl.endOffsetInclusive = ch; // <--- Added (covers a link that is only one character long)
            }
            else {
              curUrl.endOffsetInclusive = ch;
            }          
          }
          else {
            if (inUrl) {
              // Not any more, we're not.
              inUrl = false;
    
              curUrl.text = text.slice(curUrl.startOffset, curUrl.endOffsetInclusive + 1); // <--- Added
    
              links.push(curUrl);  // add to links
              curUrl = {};
            }
          }
        }
        if (inUrl) {
          // in case the link ends on the same char that the element does
          curUrl.text = text.slice(curUrl.startOffset, curUrl.endOffsetInclusive + 1); // <--- Added
          links.push(curUrl); 
        }
      }
      else {
        var numChildren = element.getNumChildren();
        for (var i=0; i<numChildren; i++) {
    
          if (links.length > 0) { // <--- Added  or if (links.length > 0) break;
            return links;
          }
    
          links = links.concat(getAllLinks(element.getChild(i)));
        }
      }
    
      return links;
    }
    
    • Here, if (links.length > 0) {return links;} can also be written as if (links.length > 0) break;, because the function returns links right after the loop anyway.
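
    • As a variation, and only as a sketch: if just one result is needed, the function can return a single link object (or null) instead of an array, so that each recursive call tells its caller "found it" through the return value and every loop up the chain stops. The name getFirstLink, the null return value and the typeof guard for childless elements are my assumptions, not part of the original script. Unlike the script above, this version also ends the link where the URL changes.

```javascript
/**
 * Sketch: return the first link under `element`, or null if there is none.
 * Assumed result shape: {element, startOffset, endOffsetInclusive, url, text}.
 */
function getFirstLink(element) {
  element = element || DocumentApp.getActiveDocument().getBody();

  if (element.getType() === DocumentApp.ElementType.TEXT) {
    var text = element.getText();
    for (var ch = 0; ch < text.length; ch++) {
      var url = element.getLinkUrl(ch);
      if (url == null) continue;

      // Start of a link: extend the end offset while the same URL continues.
      var end = ch;
      while (end + 1 < text.length && element.getLinkUrl(end + 1) === url) {
        end++;
      }
      return {
        element: element,
        startOffset: ch,
        endOffsetInclusive: end,
        url: String(url),
        text: text.slice(ch, end + 1)
      };
    }
    return null; // this Text element has no link
  }

  // Assumption: elements without getNumChildren (images, rules, ...) are leaves.
  if (typeof element.getNumChildren !== 'function') return null;

  var numChildren = element.getNumChildren();
  for (var i = 0; i < numChildren; i++) {
    var found = getFirstLink(element.getChild(i));
    if (found) return found; // a non-null result ends this loop and, in turn, each caller's loop
  }
  return null;
}
```

    Calling getFirstLink() with no argument starts from the body of the active document; the result is either null or an object whose url and text properties describe the first link.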

    Note:

    • By the way, when the Google Docs API is used, both the links and their text can also be retrieved with a simple script like the following. To use it, please enable the Google Docs API under Advanced Google services.

        function myFunction() {
          const doc = DocumentApp.getActiveDocument();
          const res = Docs.Documents.get(doc.getId()).body.content.reduce((ar, {paragraph}) => {
            if (paragraph && paragraph.elements) {
              paragraph.elements.forEach(({textRun}) => {
                if (textRun && textRun.textStyle && textRun.textStyle.link) {
                  ar.push({text: textRun.content, url: textRun.textStyle.link.url});
                }
              });
            }
            return ar;
          }, []);
          console.log(res);  // You can retrieve the first link and its text with console.log(res[0]).
        }
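
    • In the same way, the Docs API version can stop at the first linked text run instead of collecting everything. This is a minimal sketch: firstLinkFromContent and logFirstLink are hypothetical names, and like the script above it only looks at top-level paragraphs, so links inside tables are skipped. If one link is split across several text runs, only the first run's text is returned.

```javascript
/**
 * Sketch: first linked text run in a Docs API `body.content` array, or null.
 * Only top-level paragraphs are inspected; other structural elements are skipped.
 */
function firstLinkFromContent(content) {
  for (const {paragraph} of content) {
    if (!paragraph || !paragraph.elements) continue;
    for (const {textRun} of paragraph.elements) {
      if (textRun && textRun.textStyle && textRun.textStyle.link) {
        // link.url may be undefined when the link points at a bookmark or heading.
        return {text: textRun.content, url: textRun.textStyle.link.url};
      }
    }
  }
  return null;
}

// Requires the Google Docs API to be enabled under Advanced Google services.
function logFirstLink() {
  const doc = DocumentApp.getActiveDocument();
  const content = Docs.Documents.get(doc.getId()).body.content;
  console.log(firstLinkFromContent(content));
}
```

    Keeping the search in a function that takes the content array makes the early return easy to see: the first match leaves both loops at once.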