Search code examples
xmlgoogle-sheetsweb-scrapingxpathgoogle-sheets-formula

Trying to extract image URLs from XML using IMPORTXML in Google Sheet


I have been working on a IMPORTXML formula in Google Sheets to export image URLS from an XML file for specific records. I have not had much luck with the XPath I have been trying due to the hierarchy of the records.

Here is the URL for the XML file: http://www.roguerealestate.com/listings.xml

I have been able to extract all IDValues using the following formula:

=IMPORTXML("http://www.roguerealestate.com/listings.xml","//Property/Identification/IDValue")

I now want to be able to select an IDValue from a list and have it return all Src tags for that IDValue

I was able to get all Src tags using this formula, just need help parsing it down to only show URLs for a specific IDValue:

=IMPORTXML("http://www.roguerealestate.com/listings.xml","//Property/Floorplan/File/Src")

Any help would be much appreciated, I am just learning Xpath.


Solution

  • The following query will select all 'Src' elements from the 'Property' elements that have the '/Identification/IDValue' field set to 110212 or 110151:

    //Property[./Identification/IDValue[text()=110212 or text()=110151]]/Floorplan/File/Src
    

    Please verify on http://xpather.com/1ZgDP5Wz if this works as expected (the provided XML file was shortened and pasted). You can click on the results on the right to scroll to the relevant parts of the source document.