Search code examples
google-sheetsweb-scrapingxpathxml-parsinggoogle-sheets-formula

I'm trying to parse data from a webpage using the importXML function but the data doesn't have a class


I'd like to preface with I am a complete newbie to coding of all kinds. I usually spend 10x more time using google to try to automate something than it would have taken me to do it manually. I found a Google Sheet online that looks for the price of a MTG trading card from a specific web page.

This is the working function I am using in the google sheet.
=Index(IMPORTXML('Price Links Sale'!A13, "//span[@class='price price--withoutTax']"),1)

Where 'Price Links Sale' is the sheet that contains all of the URL's for each card I'm looking up.
Example: https://www.facetofacegames.com/assassins-trophy-guilds-of-ravnica/ This pulls the price for a non-foil near mint card. Perfect. Works great.

I wanted to add another column on my google sheet that checks the quantity in stock.

What I tried was

=Index(IMPORTXML('Price Links Sale'!A13, "//span[@class='data-product-stock']"),1)

What I realized is that span in this case doesn't have a class. I tried this as well as a few other things with no success. One of them being
=Index(IMPORTXML('Price Links Sale'!A13, "//span[not@class='data-product-stock']"),1)

I have no clue what I am doing, I'm just giving it the ole' college try. Is this something incredible simple?

Each web page using the radio buttons to check the price for foil, non-foil, near mint condition and played condition. Is it possible to use excel functions to use the radio buttons to grab those prices as well if I tell the function I'm looking for foil prices?


Solution

  • I believe your goal as follows.

    • You want to retrieve the value 0 from <span data-product-stock>0</span> using IMPORTXML.
    • The URL you want to use is https://www.facetofacegames.com/assassins-trophy-guilds-of-ravnica/.

    In this case, I think that data-product-stock is the attribute which has no value. So how about the xpath of //span[@data-product-stock]? The formula is as follows.

    Sample formula:

    =IMPORTXML(A1,"//span[@data-product-stock]")
    
    • The URL is put to the cell "A1".

    Result:

    enter image description here

    Reference:

    Added:

    Pattern 1:

    For example, when you want to retrieve the values of 0 from <span data-product-stock>0</span> and <div class="card-stock"><span>0</span> In Stock</div>, you can also use the xpath of //div[contains(@class,'stock')]/span[1].

    Sample formula:
    =IMPORTXML(A1,"//div[contains(@class,'stock')]/span[1]")
    
    Result:

    enter image description here

    • In this case, 10 items are retrieved.

    Pattern 2:

    For example, when you want to retrieve 18 of 18 in stock, how about the following sample formula?

    Sample formula:
    =JOIN("",IFERROR(ARRAYFORMULA(REGEXEXTRACT(IMPORTXML(A1,"//script"),"stock"":(\d+),")),""))
    
    • In this case, it seems that the value cannot be directly retrieved. So in this sample formula, the value is retrieved from the Javascript.
    Result:

    enter image description here