Search code examples
javascriptgoogle-apps-scriptgoogle-sheetsfetchgoogle-sheets-formula

Fetch website data using app scripts in Google sheets


Earlier I asked a question about fetching data with the =IMPORTXML function in Google Sheets. Now I've got that working but I understood that a script could work better for the reliability an usability.

The issue with the present script (see below) is that it only fetch data from one url. I want to import data from multiple url's (every country has its own unique url). I want to fetch multiple data per URL. It all concerns travel advices from the Dutch gouverment to other countries. Within my spreadsheet I've added a column with the unique URL's in A, plain tekst countries in B and the data to fetch in column C, D and E.

The data I would like to fetch

  • colour-status (Geel, Oranje, Rood) [can be found within <h3></h3>]
  • mapurl (url of the .png file with the map including travel advices) [can be found within <paragraph></paragraph>]
  • summary (summary of the present travel advice) [can be found within <summary></summary>]

I hope that you can help me to improve the script to make it easier to fetch 'traveladvice' data from the government.

function parseXml() {
  
  var url = 'https://opendata.nederlandwereldwijd.nl/v1/sources/nederlandwereldwijd/infotypes/traveladvice/a790e247-8ff8-4aa8-be93-9202ccaa180a';
  var xml = UrlFetchApp.fetch(url).getContentText();
  var document = XmlService.parse(xml);
  str = xml.split("<paragraphtitle>Veiligheidsrisico’s</paragraphtitle>")[1]
  mySubString = str.match(new RegExp("<summary>" + "(.*)" + "</summary>"))[1];
  
  return mySubString;
}

(Script from: Marios, thanks Marios!) And followed by =parseXml() within a cell in the spreadsheet to fetch the data.


Solution

  • Solution:

    You can use the following three custom functions:

    function parseXmlSummary(url) {
      
      try{
      var xml = UrlFetchApp.fetch(url).getContentText();
      mySubString = xml.match(new RegExp("<summary>" + "(.*)" + "</summary>"))[1];
      return mySubString;
      }
      catch(e){return "";}
    }
    
    function parseXmlMapURL(url) {
       
      try{
      var xml = UrlFetchApp.fetch(url).getContentText();
      mySubString = "https://"+xml.match(new RegExp("CDATA\\[https://" + "(.*)" + ".png"))[1]+".png";
      return mySubString;
      }
      catch(e){return "";}
    }
    
    
    function parseXmlColours(url) {
      
      try{
      var xml = UrlFetchApp.fetch(url).getContentText();
      str = xml.split("<paragraphtitle>Veiligheidsrisico’s</paragraphtitle>")[1]
      var colours = ['Geel', 'Oranje', 'Rood'];
      res_colour = colours.filter(c => str.includes(c))[0];
      return res_colour;
      } 
      catch(e){return "";}
    
    }
    

    and then you can call them separately as formulas like that:

    • =parseXmlSummary(A7)
    • =parseXmlMapURL(A7)
    • =parseXmlColours(A7)

    Assuming cell A7 contains the given URL.

    For all the other cells below (A8, A9, ..) you can just drag the formula down as you usually do with other formulas.