Search code examples
google-sheetsweb-scrapinggoogle-sheets-formulayahoo-financexpathquery

Google Sheets ImportXML returns "Resource at URL not found" error


I am trying to pull data from Yahoo Finance into Google Sheets using the importXML function. More specifically I am trying to pull it from the "key-statistics" page on a specific stock. For some reason, whenever I try to pull a value from this specific page, I get a "resource at URL not found" error. I've tried disabling JavaScript on the page to see if that was the issue, but everything appeared fine. It's specifically on this page however, and not on others. For example, I cannot pull XML data from https://finance.yahoo.com/quote/TSLA/key-statistics?p=TSLA, however, I can pull data from https://finance.yahoo.com/quote/TSLA perfectly fine. Is it an issue with the URL being wrong or has Yahoo somehow blocked data from being pulled?

Tries to pull "Return on Assets" from Statistics page

Pulls Market Cap from main page


Solution

  • As explained here, the content of yahoo web site is built dynamically by the mean of javascript. https://webapps.stackexchange.com/questions/115664/how-to-know-if-google-sheets-importdata-importfeed-importhtml-or-importxml-fun Nevertheless, the source contains all the elements in a json string called root.App.main . This json is far too big to be explored here, but we can take a slice, for example:

    function getJSON() {
      var url = SpreadsheetApp.getActiveSheet().getRange("A1").getValue()
      var source = UrlFetchApp.fetch(url).getContentText()
      var data1 = source.match(/(?<="shortName":"Tesla, Inc.",).*/g)
      data2 = data1[0].substring(0,1200).match(/.*(?=,"uuid")/g)
      console.log("{"+data2[0]+"}")
    }
    

    The problem then is to explore the json as needed. I have performed a small test here https://docs.google.com/spreadsheets/d/1EfHtFr51cJdR6PK2E8cDtMAmUHJNQcOPeOs_6j0GZZc/edit?usp=sharing