How to extract <figure> images using Selenium in Python?


I am trying to extract the image at the XPath below from this App Store page: https://apps.apple.com/us/app/mercer-marketplace-benefits/id1041417557


I tried the following code using the xpath:

driver.get('https://apps.apple.com/us/app/mercer-marketplace-benefits/id1041417557')
rating_distr = WebDriverWait(driver,30).until(EC.presence_of_element_located((By.XPATH, """(//*[@id="ember290"]/div/div[2])""")))
print(rating_distr.get_attribute('innerHTML'))

But the output is not an image:

    <figure class="we-star-bar-graph">
    <div class="we-star-bar-graph__row">
      <span class="we-star-bar-graph__stars we-star-bar-graph__stars--5"></span>
      <div class="we-star-bar-graph__bar">
        <div class="we-star-bar-graph__bar__foreground-bar" style="width: 76%;"></div>
      </div>
    </div>
    <div class="we-star-bar-graph__row">
      <span class="we-star-bar-graph__stars we-star-bar-graph__stars--4"></span>
      <div class="we-star-bar-graph__bar">
        <div class="we-star-bar-graph__bar__foreground-bar" style="width: 12%;"></div>

Is there any way to extract the output as an image? Thanks for the help!


Solution

  • As I suggested in my comment, I think a better and faster approach is to read the values directly instead of taking a screenshot. With a screenshot, someone has to open each image and transcribe the values by hand into some other format, which is slow and tedious. Instead, scrape the data from the page and write it out in the final desired format.

    For example, look at the HTML for just the 5-star rating bar:

    <div class="we-star-bar-graph__row">
        <span class="we-star-bar-graph__stars we-star-bar-graph__stars--5"></span>
        <div class="we-star-bar-graph__bar">
            <div class="we-star-bar-graph__bar__foreground-bar" style="width: 76%;"></div>
        </div>
    </div>
    

    You can see that there's a class applied, we-star-bar-graph__stars--5, that indicates what star rating it is. You can also see that the width of the bar is set, style="width: 76%;", so that tells you the % of 5-star ratings. With that info, we can scrape the rating for each star.
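Because the percentage is already present in the inline style, one option is to read it straight from the markup. The following is a minimal sketch using only the standard library, assuming the markup keeps the class names and `style="width: N%;"` pattern shown in the question. `parse_star_ratings` and `SAMPLE_HTML` are hypothetical names introduced here for illustration; with Selenium, the HTML string could come from `get_attribute('innerHTML')` as in the question.

```python
import re

# Sample markup copied from the question; the live page may differ.
SAMPLE_HTML = """
<div class="we-star-bar-graph__row">
  <span class="we-star-bar-graph__stars we-star-bar-graph__stars--5"></span>
  <div class="we-star-bar-graph__bar">
    <div class="we-star-bar-graph__bar__foreground-bar" style="width: 76%;"></div>
  </div>
</div>
<div class="we-star-bar-graph__row">
  <span class="we-star-bar-graph__stars we-star-bar-graph__stars--4"></span>
  <div class="we-star-bar-graph__bar">
    <div class="we-star-bar-graph__bar__foreground-bar" style="width: 12%;"></div>
  </div>
</div>
"""

# The modifier class ends in the star count, e.g. "...__stars--5".
STAR_RE = re.compile(r"we-star-bar-graph__stars--(\d)")
# The foreground bar carries the percentage in its inline style.
WIDTH_RE = re.compile(r'foreground-bar"\s+style="width:\s*([\d.]+)%')


def parse_star_ratings(html):
    """Hypothetical helper: map star count -> percentage from the inline style."""
    stars = [int(s) for s in STAR_RE.findall(html)]
    widths = [float(w) for w in WIDTH_RE.findall(html)]
    # Assumes each row contributes exactly one star class and one width, in order.
    return dict(zip(stars, widths))


print(parse_star_ratings(SAMPLE_HTML))  # {5: 76.0, 4: 12.0}
```

Regular expressions are brittle against markup changes, so this is only reasonable for a small, fixed snippet like this one.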

    from selenium.webdriver.common.by import By

    # one foreground bar per star rating, in page order (5 stars down to 1)
    ratings = driver.find_elements(By.CSS_SELECTOR, "figure.we-star-bar-graph div.we-star-bar-graph__bar__foreground-bar")
    # get the computed width of the entire bar, stripping the trailing "px"
    width = float(driver.find_element(By.CSS_SELECTOR, ".we-star-bar-graph__bar").value_of_css_property("width")[:-2])
    for i in range(len(ratings), 0, -1):
        # get the computed width of this rating's bar, stripping the trailing "px"
        rating = float(ratings[len(ratings) - i].value_of_css_property("width")[:-2])
        print(f"{i}-star rating: {round(rating / width * 100)}%")
    

    This should dump values like

    5-star rating: 76%
    4-star rating: 12%
    3-star rating: 4%
    2-star rating: 1%
    1-star rating: 6%
    

    That might not be your final desired format but it should get you pointed in the right direction.
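If the final format is something like CSV, the scraped percentages can be written out directly. This is a rough sketch, assuming the values have already been collected into a dict keyed by star count; `ratings_to_csv` and `sample` are hypothetical names, and the numbers are the ones from the sample output above.

```python
import csv
import io


def ratings_to_csv(ratings):
    """Hypothetical helper: render {stars: percent} as CSV text, highest star first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["stars", "percent"])
    for stars in sorted(ratings, reverse=True):
        writer.writerow([stars, ratings[stars]])
    return buffer.getvalue()


# Values taken from the sample output above.
sample = {5: 76, 4: 12, 3: 4, 2: 1, 1: 6}
print(ratings_to_csv(sample))
```

To write a file instead of a string, the same `csv.writer` calls work on a file opened with `open(path, "w", newline="")`.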