Search code examples
google-sheetsxpathconcatenationflattengoogle-query-language

Don't expand on results when using Google Sheets' importXML


Google Sheets has a cool function called importxml. It lets you scrape data from public web sources e.g. a URL's title.

You can write an XPath expression e.g. //title/text() and it will find all occurrences of some value and display those in your Google Sheet as separate rows. Nifty!

If the contents of <element></element> contain HTML markup e.g. <element>some<b>data</b>is<em>here</em>. then Google Sheets gives you two options:

  1. if you append /text() at the end of your XPath then the result is the direct text value of the element e.g. some here
  2. if you do not append /text() at the end of your XPath then the result is the entire text split across multiple cells in the row. You end up with |some|data|is|here|. (where | acts as my cell divider)

I am trying to get some data is here in a single cell, not across multiple cells on my row. Is there a way to do that?

Using textjoin or concatenate on importxml doesn't work because it then applies it to the entire result and if there are multiple instances of <element>some<b>data</b>is<em>here</em>. then I get them all in a single cell instead of one result per row (which is what I want).


Solution

  • try:

    =TEXTJOIN(" "; 1; your_importxml_here)
    

    for array use:

    =FLATTEN(QUERY(TRANSPOSE(your_importxml_here);;9^9))