google-sheets web-scraping import google-sheets-formula wikipedia

Google Spreadsheet - How to import a specific table from Wikipedia?


I'm currently using this to import a table from Wikipedia:

=IMPORTHTML("https://en.wikipedia.org/wiki/2020%E2%80%9321_Premier_League";"table";6)

It takes the 6th table from that Wikipedia page, which is the "results table" for the Premier League (football). It nicely imports the data in a 21 row x 21 column matrix.

But from time to time, people add new tables to that Wikipedia page, which changes the order of the tables. The target table is then no longer the 6th table.

Question: Is there a way to import the "results table" directly instead of referring to it by table number? (I tried fooling around with the IMPORTXML function, but I don't know how to use XPath.)


Solution

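  • Try selecting the table by its class attribute instead of its position (if your locale uses ";" as the argument separator, as in the question, replace the "," with ";"):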

    =IMPORTXML("https://en.wikipedia.org/wiki/2020%E2%80%9321_Premier_League",
     "//table[@class='wikitable plainrowheaders']/tbody/tr")
    

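To see why the XPath above keeps working when tables are added, here is a minimal Python sketch. It is not part of the original answer. The page content is an invented, heavily simplified stand-in (the team names and scores are placeholders), and it uses the standard library's ElementTree, which supports only a subset of XPath and would not parse the real Wikipedia HTML. It is meant only to illustrate the difference between selecting by position and selecting by class.

```python
import xml.etree.ElementTree as ET

# Invented stand-in for the page: someone has inserted a new table
# before the results table, shifting every table's position by one.
PAGE = """<html><body>
  <table class="wikitable"><tbody>
    <tr><th>Newly added table</th></tr>
  </tbody></table>
  <table class="wikitable plainrowheaders"><tbody>
    <tr><th>Home</th><th>A</th><th>B</th></tr>
    <tr><th>Team A</th><td></td><td>1-0</td></tr>
    <tr><th>Team B</th><td>2-2</td><td></td></tr>
  </tbody></table>
</body></html>"""

root = ET.fromstring(PAGE)


def cell_text(cell):
    """Return the text content of a table cell, stripped of whitespace."""
    return "".join(cell.itertext()).strip()


# Selecting by position (what an index-based import does):
# the first table is now the newly inserted one.
first_table = root.findall(".//table")[0]
by_position = [[cell_text(c) for c in tr]
               for tr in first_table.findall("./tbody/tr")]

# Selecting by class attribute (same XPath idea as the formula above):
# the match does not depend on how many tables precede it.
by_class = [[cell_text(c) for c in tr]
            for tr in root.findall(
                ".//table[@class='wikitable plainrowheaders']/tbody/tr")]

print(by_position)
print(by_class)
```

Note that `@class='wikitable plainrowheaders'` is an exact string match, so the selection would stop matching if the table's class attribute on the page ever changed, and it would return rows from several tables if more than one table carried exactly that class.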