web-scraping, google-sheets, xpath, google-sheets-formula, google-query-language

Scraping the Developer Website Link from Google Play with XPath


I'm new to XPath and page scraping. I need to extract the link to the developer website from a Google Play app page (Developer -> Visit Website) using the IMPORTXML function in Google Sheets. I tried several approaches, but none of them worked:

  1. Starting from //main: importxml(link; "//main/c-wiz[3]/div[1]/div[2]/div//div[9]/div/span/div/span/div/@href")
  2. Full XPath copied from the developer console: importxml(link; "//div[4]/c-wiz/div/div[2]/div/div/main/c-wiz[3]/div[1]/div[2]/div/div[9]/span/div/span/div[1]/a/@href")

Before scraping the Google Play page, I had a similar task for the App Store and came up with the following formula, which does not work on Google Play: importxml(link; "//section[contains(@class,'section--link-list')]/ul/li[1]/a/@href")

The main issue for me now is that the path to the website link looks correct in the first two cases, yet I cannot get any link at all. Can you please advise me on how to scrape it correctly?

Thank you in advance!


Solution

  • try:

    =REGEXEXTRACT(QUERY(FLATTEN(IMPORTDATA(A1)), 
     "where Col1 starts with 'url:' 
        and Col1 ends with '}'", 0), """(.*)""")
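
    Reading the formula from the inside out: IMPORTDATA fetches the page as delimited text and splits it into cells, FLATTEN stacks those cells into a single column, QUERY keeps the cell that starts with 'url:' and ends with '}', and REGEXEXTRACT returns whatever sits between the double quotes. The Python sketch below mirrors only the filter and extract steps so the logic can be checked offline. The function name and the sample cells are hypothetical and made up for illustration; the real page content may look different.

```python
import re


def extract_developer_url(cells):
    """Mimic the QUERY + REGEXEXTRACT steps on an already-flattened list of cells."""
    for cell in cells:
        # QUERY: where Col1 starts with 'url:' and Col1 ends with '}'
        if cell.startswith("url:") and cell.endswith("}"):
            # REGEXEXTRACT: capture everything between the first and last double quote
            match = re.search(r'"(.*)"', cell)
            if match:
                return match.group(1)
    return None


# Hypothetical cells, shaped like fragments that a comma/newline split might yield.
sample_cells = [
    "<!doctype html>",
    "{key: 'ds:5'",
    'url:"https://example.com/"}',
    'title:"Example App"',
]

print(extract_developer_url(sample_cells))
```

    If no cell matches, the sketch returns None; the Sheets formula has no such fallback and would show an error in that case.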
    
