Use AppleScript to fetch the price of an item in the currently open Chrome page on Amazon.co.uk


Overview

I can use AppleScript to get the title of a page using:

tell application "Google Chrome" to return title of active tab of front window

If I have, for example, the page https://www.amazon.co.uk/Agua-Brava-Men-EDC-Splash/dp/B000E7YK0U/ open, I can see the current price is £22.86.

[Screenshot: amazon.co.uk product page with the price highlighted]

Can I use AppleScript (or JavaScript) to fetch this value? An approximate price is fine; it doesn't need to be exact, so £22 or 22 would be acceptable.

View Source on the Amazon.co.uk page

When I view source for that page I find:

<span class="a-price a-text-price a-size-medium apexPriceToPay" data-a-size="b" data-a-color="price"><span class="a-offscreen">£22.86</span><span aria-hidden="true">£22.86</span></span>

[Screenshot: page source of the amazon.co.uk item, showing the price span]

How could I use AppleScript or JavaScript to get £22.86 for me?


Solution

  • (Note that in this example, I’m using a different product, because Agua Brava is apparently no longer available as I write this.)

    Your example is on the right track. However, JavaScript uses zero-based indexing, not one-based indexing, so the first match is at index 0.

    I verified that this produces a “missing value”:

    tell application "Google Chrome"
        tell window 1
            tell tab 1
                execute javascript "document.getElementsByClassName('a-price a-text-price a-size-medium apexPriceToPay')[1].innerHTML"
            end tell
        end tell
    end tell
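
    The missing value fits how JavaScript treats an out-of-range index. A minimal sketch, using a plain array as a stand-in for the collection that getElementsByClassName returns and assuming the page holds exactly one matching span; innerHTMLAt is a hypothetical helper, not part of any API:

    ```javascript
    // Hypothetical helper: read innerHTML from the element at `index`,
    // returning null instead of throwing when the index is out of range.
    function innerHTMLAt(collection, index) {
      const element = collection[index]; // zero-based: the first match is [0]
      return element ? element.innerHTML : null;
    }

    // Plain array standing in for a one-element collection (an assumption).
    const matches = [{ innerHTML: '<span class="a-offscreen">£19.35</span>' }];

    const first = innerHTMLAt(matches, 0);  // the HTML inside the span
    const second = innerHTMLAt(matches, 1); // null, because matches[1] is undefined
    ```

    Without the guard, matches[1] is undefined and reading .innerHTML from it throws a TypeError, which is presumably what Chrome hands back to AppleScript as a missing value.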
    

    I then replaced the 1 index with a 0:

    tell application "Google Chrome"
        tell window 1
            tell tab 1
                execute javascript "document.getElementsByClassName('a-price a-text-price a-size-medium apexPriceToPay')[0].innerHTML"
            end tell
        end tell
    end tell
    

    And it returned the HTML code contained by that span:

    <span class="a-offscreen">£19.35</span><span aria-hidden="true">£19.35</span>
    

    To get the actual price, you then need the innerHTML of one of the child elements of that span. Something like:

    tell application "Google Chrome"
        tell window 1
            tell tab 1
                set productPrice to execute javascript "document.getElementsByClassName('a-price a-text-price a-size-medium apexPriceToPay')[0].children[0].innerHTML"
            end tell
        end tell
    end tell
    productPrice
    

    This produces £19.35, which is one of the results your question says would be acceptable. If you want to do math on it, however, you need to remove the £. This is most easily done by dropping the first character:

    tell application "Google Chrome"
        tell window 1
            tell tab 1
                set productPrice to execute javascript "document.getElementsByClassName('a-price a-text-price a-size-medium apexPriceToPay')[0].children[0].innerHTML"
            end tell
        end tell
    end tell
    set productPrice to the rest of the characters of productPrice as string
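
    The cleanup could also be done on the JavaScript side, before the value reaches AppleScript. A rough sketch, where parsePrice is a hypothetical helper and the period is assumed to be the decimal separator, as it is on amazon.co.uk:

    ```javascript
    // Hypothetical helper: turn a price string such as "£19.35" into a number.
    function parsePrice(priceText) {
      // Drop everything except digits and the decimal point
      // (currency symbol, thousands separators, stray whitespace).
      const cleaned = priceText.replace(/[^0-9.]/g, '');
      const value = parseFloat(cleaned);
      // parseFloat returns NaN when no number is left; report that as null.
      return Number.isNaN(value) ? null : value;
    }

    const price = parsePrice('£19.35');    // a number, not a string
    const large = parsePrice('£1,299.00'); // the comma is stripped before parsing
    ```

    Stripping by character class rather than by position means a leading space or a different currency symbol does not shift the result.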
    

    Note that the as string coercion in the AppleScript above is necessary: the rest of the characters of a string (or, more generally, characters x thru y of a string) produces a list of individual characters, and as string glues them back together, provided AppleScript's text item delimiters are at their default. AppleScript is not strongly typed, so you can do math on a string that is easily convertible to a number, as the string 19.35 is in this case. For example, if you need it to be rounded, you could use:

    tell application "Google Chrome"
        tell window 1
            tell tab 1
                set productPrice to execute javascript "document.getElementsByClassName('a-price a-text-price a-size-medium apexPriceToPay')[0].children[0].innerHTML"
            end tell
        end tell
    end tell
    set productPrice to the rest of the characters of productPrice as string
    set productPrice to round (productPrice)
    

    This produces the result 19, successfully rounding the string 19.35.
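
    Which kind of rounding you want depends on what counts as approximate. A short sketch using the price from the question, purely to illustrate the JavaScript built-ins Math.round and Math.floor:

    ```javascript
    const price = 22.86; // the price shown in the question

    const nearest = Math.round(price);     // rounds to the nearest whole number
    const wholePounds = Math.floor(price); // drops the pence
    const label = '£' + wholePounds;       // put the currency symbol back

    console.log(nearest, wholePounds, label);
    ```

    The question mentions £22 as an acceptable answer for £22.86, which matches Math.floor; rounding to the nearest pound gives 23 instead.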

    There are other ways you might get at the text £19.35. The span that actually contains the text has the apparently unique class a-offscreen, making possible the shorter command:

    tell application "Google Chrome"
        tell window 1
            tell tab 1
                set productPrice to execute javascript "document.getElementsByClassName('a-offscreen')[0].innerHTML"
            end tell
        end tell
    end tell
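
    A CSS selector is another way to express this lookup. A sketch, assuming the two class names from the question's markup (apexPriceToPay on the outer span, a-offscreen on the inner one) are still present; priceFromSelector is a hypothetical helper and doc stands for the page's document:

    ```javascript
    // Hypothetical helper: `doc` is expected to be the page's `document`.
    function priceFromSelector(doc) {
      // Descendant selector: an a-offscreen element inside apexPriceToPay.
      const element = doc.querySelector('.apexPriceToPay .a-offscreen');
      // textContent gives the text without markup; null means nothing matched.
      return element ? element.textContent.trim() : null;
    }
    ```

    Tying the inner class to its parent narrows the search in case a-offscreen turns out not to be unique on the page. Inside execute javascript the same lookup could be written as the single expression document.querySelector('.apexPriceToPay .a-offscreen').textContent, although that form throws when nothing matches.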
    

    Or, it might (or might not…) be more reliable to get the nearest parent tag that has an id, which in this case appears to be corePrice_feature_div, and then drill down from that:

    tell application "Google Chrome"
        tell window 1
            tell tab 1
                set productPrice to execute javascript "document.getElementById('corePrice_feature_div').children[0].children[0].children[0].innerHTML"
            end tell
        end tell
    end tell
    

    Another way of focusing on a specific part of a page is to find the nearest parent with an id and then search within it for elements that match a class name. You can call getElementsByClassName on any element, not just on document, and it will only return descendants of that element.

    tell application "Google Chrome"
        tell window 1
            tell tab 1
                set productPrice to execute javascript "document.getElementById('corePrice_desktop').getElementsByClassName('a-offscreen')[0].innerHTML"
            end tell
        end tell
    end tell
    

    (It appears that the unique id surrounding the price area has changed from corePrice_feature_div to corePrice_desktop over the last few days, which ties into my closing advice.)
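
    Since the container id has already changed once, one option is to try each known id in turn. This is a sketch only: the two ids are the ones mentioned in this answer, priceFromContainers is a hypothetical helper, and doc stands for the page's document:

    ```javascript
    // Container ids seen so far, newest first. Amazon may change them again.
    const CONTAINER_IDS = ['corePrice_desktop', 'corePrice_feature_div'];

    // Hypothetical helper: return the first price found in any known container.
    function priceFromContainers(doc, ids = CONTAINER_IDS) {
      for (const id of ids) {
        const container = doc.getElementById(id);
        if (!container) continue; // this id is not on the page, try the next one
        const matches = container.getElementsByClassName('a-offscreen');
        if (matches.length > 0) return matches[0].textContent.trim();
      }
      return null; // none of the known containers held a price
    }
    ```

    When the layout changes again, only the list of ids needs updating, and a null result is a clear signal that it is time to look at the page source.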

    You might even eschew getting text via searching for tags, and just run a regular expression over the entire body of the page:

    tell application "Google Chrome"
        tell window 1
            tell tab 1
                set productPrice to execute javascript "document.body.innerHTML.match(/(£[1-9][0-9]\\.[0-9][0-9])[^0-9]/)[1]"
            end tell
        end tell
    end tell
    

    This also returns £19.35, because “£19.35” is the first text on the page that consists of a “£”, a digit from 1 to 9, a digit from 0 to 9, a period, and exactly two more digits. As written, the pattern therefore only matches prices from £10.00 to £99.99. Because this is a regular expression, you have a lot of flexibility in how you choose what to search for and what to avoid.

    The regex match uses index 1 rather than index 0 because index 0 is the full match, including whatever non-digit character follows the price; index 1 is the first (and, in this case, only) capture group.
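
    The pattern can be loosened to accept prices of any size. A sketch, assuming prices are written with a period before the pence and optional commas as thousands separators; firstPoundPrice is a hypothetical helper:

    ```javascript
    // Hypothetical helper: find the first pound price in a string of HTML or text.
    function firstPoundPrice(text) {
      // £, a digit, any further digits or commas, a period, exactly two digits.
      // (?!\d) rejects a candidate that is followed by a third decimal digit.
      const match = text.match(/(£\d[\d,]*\.\d{2})(?!\d)/);
      // match() returns null when nothing matches, so guard before indexing.
      return match ? match[1] : null;
    }

    const price = firstPoundPrice('<span class="a-offscreen">£19.35</span>');
    const large = firstPoundPrice('Was £1,299.00');
    const none = firstPoundPrice('Currently unavailable');
    ```

    The lookahead does not consume a character, so a price at the very end of the text still matches. If the pattern is pasted into an AppleScript string, each backslash has to be doubled, as with the \\. in the command above.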

    All of these methods have the problem that they will fail when Amazon changes class names, or changes the layout of the page such that index zero is no longer the correct result, or starts adding more prices in arbitrary locations. Whether that’s a problem will depend on how often that happens, which you’ll find out once you start using your script regularly.

    It probably isn’t worth worrying about ahead of time unless this is a critical app. Once you see how the page tends to change over time, you may find that one of the above solutions is better than the others, or that yet another solution will be more appropriate.