I am trying to retrieve information from a table on this page: https://ski-resort-stats.com/ski-resorts-in-europe/
The page has a drop-down menu that I must act on first so that all the entries are shown on the page and can be selected. However, when I then retrieve the data I am looking for, I do not get the whole table. I tried adding a sleep between the two actions in case it was a timing issue, but nothing changes. Could someone help me with that? Here is my code:
from bs4 import BeautifulSoup
from selenium import webdriver
driver = webdriver.Chrome("path/chromedriver")
driver.get("https://ski-resort-stats.com/ski-resorts-in-europe/")
content = driver.page_source
soup = BeautifulSoup(content)
# Select "All" in the drop-down menu to show all the ski resorts
menu = driver.find_element_by_id("table_1_length")
for option in menu.find_elements_by_tag_name('option'):
    if option.text == 'All':
        option.click()
        break
import time
time.sleep(10)
mydivs = soup.find_all("td",{"class":"column-resort-name"})
print(mydivs)
The last element printed from mydivs is not the last element of the table.
All the data is already present in the page's <table>, so you can parse it directly with requests and BeautifulSoup:
import requests
from bs4 import BeautifulSoup
url = "https://ski-resort-stats.com/ski-resorts-in-europe/"
soup = BeautifulSoup(requests.get(url).content, "html.parser")
# print some data from rows
for row in soup.select("#table_1 tbody tr"):
    r = [td.get_text(strip=True) for td in row.select("td")]
    print(r[1])
Prints:
Hemsedal
Geilosiden Geilo
Golm
Hafjell
Voss
Hochschwarzeck
Rossfeld - Berchtesgaden - Oberau
...
Puigmal
Kranzberg-Mittenwald
Wetterstein lifts-Wettersteinbahnen-– Ehrwald
Stuhleck-Spital am Semmering
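If you would rather keep Selenium, note that in the question's code soup is built from driver.page_source before "All" is clicked. A BeautifulSoup object is a snapshot of the HTML string it was given, so sleeping afterwards cannot change what it contains. Below is a minimal sketch of that point. The helper name resort_names and the two inline HTML snippets are hypothetical stand-ins for driver.page_source before and after the click; they are not taken from the real page.

```python
from bs4 import BeautifulSoup


def resort_names(html):
    """Return the text of every resort-name cell found in an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    return [td.get_text(strip=True) for td in soup.select("td.column-resort-name")]


# Hypothetical snapshots standing in for driver.page_source
# before and after "All" is selected in the drop-down menu.
before_click = """
<table id="table_1"><tbody>
<tr><td class="column-resort-name">Hemsedal</td></tr>
</tbody></table>
"""
after_click = """
<table id="table_1"><tbody>
<tr><td class="column-resort-name">Hemsedal</td></tr>
<tr><td class="column-resort-name">Voss</td></tr>
</tbody></table>
"""

# Each call parses only the string it receives, so the earlier
# snapshot never picks up rows that appear later.
names_before = resort_names(before_click)
names_after = resort_names(after_click)
print(names_before)
print(names_after)
```

With Selenium, this would mean calling resort_names(driver.page_source) only after the click (and any wait), instead of reusing a soup created right after driver.get().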