Tags: regex, image, web-scraping, xpath, google-sheets-formula

Extracting the style attribute value via XPath with Google Sheets IMPORTXML


Using Google Sheets IMPORTXML, I want to use XPath to extract an attribute value (the image URL) from:

<div class="lazy" style="display: block; background-image: url(&quot;https://www.etebg.net/UserFiles/pictures/E28BA835-34EC-C291-F627-856FE8B6DF90.jpg?cache&amp;block&amp;q=100&amp;w=350&amp;h=350&quot;);"></div>

From the site:
https://www.etebg.net/ro/search/p1/C%C4%83utare.html?q=%25%25

The XPath /html/body/div[2]/div[2]/div/a/div[1]/div/@class works and returns the first class value,
but /html/body/div[2]/div[2]/div/a/div[1]/div/@style returns #N/A.


Solution

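  • The style attribute is likely added by the page's lazy-loading script after load, so it is not in the raw HTML that Sheets fetches; the raw HTML appears to carry the image URL in a data-src attribute instead. Try reading the page with IMPORTDATA and extracting the URLs with a regex: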

    =ARRAYFORMULA(REGEXEXTRACT(QUERY(FLATTEN(IMPORTDATA(
     "https://www.etebg.net/ro/search/p1/C%C4%83utare.html?q=%25%25")), 
     "where Col1 contains 'data-src'"), "(https.*jpg)"))
    


    Or, to render the images directly in the cells, wrap the result in IMAGE:

    =ARRAYFORMULA(IMAGE(REGEXEXTRACT(QUERY(FLATTEN(IMPORTDATA(
     "https://www.etebg.net/ro/search/p1/C%C4%83utare.html?q=%25%25")), 
     "where Col1 contains 'data-src'"), "(https.*jpg)")))
    

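For anyone who wants to see what the formula is doing step by step, here is a rough Python sketch of the same pipeline. It is only an approximation under stated assumptions: SAMPLE_HTML is made-up markup on a placeholder domain standing in for the raw page source (the real markup may differ), extract_image_urls is a hypothetical helper name, and IMPORTDATA's CSV parsing is imitated with a plain split on commas.

```python
import re

# Hypothetical raw HTML, standing in for what the server returns before any
# JavaScript runs. The real page's markup may differ.
SAMPLE_HTML = """<div class="results">
<a href="/p/1"><div class="lazy" data-src="https://example.com/pictures/a1.jpg?q=100&w=350"></div></a>
<a href="/p/2"><div class="lazy" data-src="https://example.com/pictures/b2.jpg?q=100&w=350"></div></a>
<p>no image here</p>
</div>"""


def extract_image_urls(raw_text):
    """Return one URL for each text fragment that mentions data-src."""
    # Rough stand-in for FLATTEN(IMPORTDATA(...)): one cell per
    # comma-separated fragment of each line.
    cells = [cell for line in raw_text.splitlines() for cell in line.split(",")]

    # Equivalent of QUERY(..., "where Col1 contains 'data-src'").
    matching = [cell for cell in cells if "data-src" in cell]

    # Equivalent of REGEXEXTRACT(..., "(https.*jpg)"): first match per cell,
    # greedy like the formula. Cells without a match are skipped here,
    # whereas Sheets would show an error for them.
    pattern = re.compile(r"(https.*jpg)")
    urls = []
    for cell in matching:
        found = pattern.search(cell)
        if found:
            urls.append(found.group(1))
    return urls


if __name__ == "__main__":
    for url in extract_image_urls(SAMPLE_HTML):
        print(url)
```

Because the pattern is greedy, a fragment containing "jpg" more than once would match up to the last occurrence, both here and in the formula. If that turns out to be a problem, a non-greedy pattern such as (https.*?jpg) is one option to try.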