Tags: python, python-3.x, selenium-webdriver, webdriverwait

Is there any way to refactor this selenium function to improve performance?


I am working on a Discord bot that uses Selenium to scrape a player's rank from Rocket League Tracker on command. The scrape_rank() function below is what I've come up with. It works (sometimes), but I run into a lot of issues with it. When run from Replit, it appears to max out CPU and RAM and often fails with a webdriver TimeoutException. The same code runs fine for the most part in a Google Cloud VM, but after running for a while it seems to fail with no exception (despite the session remaining active in tmux).

The code feels a bit messy. The element in question is difficult to track down on the page. Sometimes the correct element text (rank) is scraped, but other times it scrapes the player's MMR instead, so I added an .isdigit() check to catch when it scrapes MMR and, if True, run a new find_element to (hopefully) scrape the correct info. My understanding is that the page is dynamic, and I think the result of find_element depends on the page load order, i.e. if the MMR element loads before the rank element, then find_element scrapes the "wrong" info.

Maybe there's a way to rerun the same find_element command until the element comes back False for .isdigit()? Ultimately I simply want to scrape the player rank from a single known URL and return it.


def scrape_rank(pName):

  driver.get(f"https://rocketleague.tracker.network/rocket-league/profile/steam/{pName}/overview")
  three_v_three_rank = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, "//div[contains(text(),'Standard 3v3')]/following-sibling::div")))

  if str(three_v_three_rank).isdigit():
    rank_3v3 = driver.find_element(By.XPATH, "//div[contains(text(),'Standard 3v3')]/following-sibling::div/following-sibling::div")
    return rank_3v3
    
  else:
    rank_3v3 = three_v_three_rank
    return rank_3v3

  driver.quit()
  rank_text = rank_3v3
  rank = rank_text.split('\n')[0]
  return rank
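
The retry idea described above (re-running the lookup until the text is no longer numeric) could be expressed as a custom wait condition, since WebDriverWait.until() accepts any callable that takes the driver and keeps polling until it returns a truthy value. The following is only a minimal sketch: the helper names first_non_numeric and rank_is_ready are hypothetical, and the handling of thousands separators in the MMR text (e.g. "1,234") is an assumption about the page, not something confirmed by the question.

```python
def first_non_numeric(texts):
    """Return the first non-empty text that is not purely numeric, else None."""
    for text in texts:
        # Keep only the first line, as the question's code does with split('\n')[0]
        first_line = text.strip().split("\n")[0].strip()
        # Assumption: an MMR may be rendered as "1,234", so drop commas before testing
        if first_line and not first_line.replace(",", "").isdigit():
            return first_line
    return None


def rank_is_ready(locator):
    """Build a condition for WebDriverWait(driver, timeout).until(...)."""
    def condition(driver):
        # find_elements returns a (possibly empty) list instead of raising
        elements = driver.find_elements(*locator)
        # None is falsy, so WebDriverWait keeps polling until a rank text appears
        return first_non_numeric(e.text for e in elements)
    return condition


# Possible usage (assumes `driver` exists and the question's XPath still matches the page):
# locator = (By.XPATH, "//div[contains(text(),'Standard 3v3')]/following-sibling::div")
# rank = WebDriverWait(driver, 10).until(rank_is_ready(locator))
```

Because the condition inspects every following sibling and returns the first non-numeric text, it should not matter whether the MMR or the rank element loads first. Stale elements are not handled in this sketch.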

Solution

  • To improve performance, I changed how the element is selected on the page.
    The code now first locates the table that holds the ranks, then looks for the 'Standard 3v3' element only inside that table, so it doesn't have to search the entire page.

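    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Assumes `driver` is an already-created WebDriver instance
    wait = WebDriverWait(driver, 10)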
    
    def scrape_rank(pName):
    
      driver.get(f"https://rocketleague.tracker.network/rocket-league/profile/steam/{pName}/overview")
    
      # Find table which contains all ranks from all gamemodes
      ranks_table = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.trn-table__container>table>tbody")))
    
      # Find 3v3 rank element inside of table; the leading "." keeps the XPath relative to ranks_table
      rank_3v3 = ranks_table.find_element(By.XPATH, ".//tr//td[@class='name']//div[contains(text(),'Standard 3v3')]/following-sibling::div")