Search code examples
google-sheetsgoogle-sheets-formula

Scraping hockey data with IMPORTHTML in Google Sheets


Please suggest a formula that would help to output a list of player statistics from all 69 pages on the KHL website, not just the first page. Maybe it is a case for IMPORTXML usage. Link to the site and the current IMPORTHTML formula inside the file: https://docs.google.com/spreadsheets/d/1KSVO3jBqAX3CressOeIDu8Hhub8eyYDrwv4iF_17HL4/edit#gid=0

=IMPORTHTML(A1,"table",1)

Solution

  • Since you'll encounter the error Resource at url contents exceeded maximum size. with the amount of data you'd like to import from this URL, you can get the data for each table instead and use IMPORTHTML or IMPORTXML like this:

    image

    To get the link structure that'll work for the formula, like this URL for the second table in the website you shared, you may do the following:

    1. Right-click the page number of the data you'd like to import > Copy link address.

    2. Paste the link to the Google Sheet that you have, and then use IMPORTHTML or IMPORTXML. Note: If you encounter an error with the formula, remove &section=skaters, which worked for me.