Search code examples
google-sheetsxpath

Limit the importxml to a defined span


Currently I am using a transpose and then another column to count the results and give me what I want. But because Tanaike is awesome and helped me on another section, I am trying to wrap my head around what he did and apply it to this.

Starting with this URL in A1,

https://www.zillow.com/homedetails/307-N-Rosedale-Ave-Tulsa-OK-74127/22151896_zpid/

This is the formula in A2:

=If($A$1:A="","",Transpose(importxml($A1:$A,"//span[@class='snl phone']")))

Based on the listing sometimes there are three phone numbers, sometimes four, and sometimes eight that get spread across as many columns as needed.

I am looking for the Property Owner phone number. This is the ELEMENT from the inspection.

<div class="info flat-star-ratings sig-col" id="yui_3_18_1_2_1506365934526_2361">  <span class="snl name notranslate">Property Owner</span>           <span class="snl phone" id="yui_3_18_1_2_1506365934526_2360">(918) 740-1698 </span>    </div>

So I tried this, and it comes up content is empty. I was thinking to look at the div class info flat, then within that the snl phone, and stop before the /end of span.

=importXML(B17,"//div[@class='info flat-star-ratings sig-col']//span[@class='snl phone']/@span")

What I really need is ONLY the property owner phone number with 95% or greater accuracy.


Solution

  • How about this modification of XPath query?

    Modified XPath query :

    =importxml(A1,"//div[@class='info flat-star-ratings sig-col']//span[@class='snl phone']")
    

    Result :

    enter image description here

    If this is not data you want, I'm sorry.

    Edit :

    4th and 8th number are the same. Is my understanding correct? If it's no problem. Please put URL and a following formula to "A1" and "A2", respectively.

    =QUERY(ARRAYFORMULA(IF(IMPORTXML(A1,"//div[@class='info flat-star-ratings sig-col']//span[@class='snl name notranslate']")="Property Owner",IMPORTXML(A1,"//div[@class='info flat-star-ratings sig-col']//span[@class='snl phone']"), "")),"Select * where Col1<>''")
    

    Result :

    enter image description here