Search code examples
xmlweb-scrapinggoogle-sheets-query

Using importXML with a php populated website and scraping to googlesheets


I am trying to import data from this website using IMPORTXML into a googlesheet http://14.139.247.11/citywx/city_weather.php?id=42488

I want to scrape the data at Minimum Temp (oC)

Loading data from this table to sheets using IMPORTHTML works just fine using this =IMPORTHTML("http://14.139.247.11/citywx/city_weather.php?id=42488","table",2)

But trying to scrape using IMPORTXML is not working. Using chrome developer tools I copied the XML path and it shows to be

/html/body/center/font/table[1]/tbody/tr[1]/td[2]/table/tbody/tr[4]/td[1]/font

This returns:

NA ie.("Imported Content is Empty")

I copied the entire html on my server. The scraping works locally when I remove the bolded "font" from /html/body/center/font/table[1]/tbody/tr[1]/td[2]/table/tbody/tr[4]/td[1]/font and set it to: /html/body/center/table[1]/tbody/tr[1]/td[2]/table/tbody/tr[4]/td[1]/font

However it still fails on the original site.

It does not look like a case of dynamic website since importHTML works and I could not find any javascript running. What am I missing here?


Solution

  • You can use Index to get the row and column position for the min temp from the imported table using IMPORTHTML not IMPORTXML. The source is HTML not XML.

    =INDEX(IMPORTHTML("http://14.139.247.11/citywx/city_weather.php?id=42488","table",2),4,2)
    

    Sheet:

    Result