Search code examples
excelexcel-formulaformula

FilterXML Formula Comes Empty for some of Data


I have a question, hopefully you may be able to help. I have been using MS Excel 2016 where i have number of data like in attached below file.

Where FILTERXML works for some cells and pulls the data but it also does not work for some cells which has similar data and returns empty.

I tried and clean the cell data while seeing to other cells data, which result is appearing with formula but it was not still working.

Your help will be greatly appreciated.

=IFERROR(FILTERXML("<a><b>"&SUBSTITUTE(SUBSTITUTE($A2,B$1&":","<r/>"),CHAR(10),"</b><b>")&"</b></a>","//b[r]["&COUNTIF($B$1:B$1,B$1)&"]"),"")

https://drive.google.com/file/d/1Xv4BUh8sObhVirqYK2fsLfwsydvrs4rv/view?usp=sharing


Solution

  • Your formula uses XML filtering using XPATH. But for XML there are some characters having special meaning and must be escaped. For example your text Pharmacy: 1ST AID PHARMACY & SURGICAL SUPPLIE would must be Pharmacy: 1ST AID PHARMACY &amp; SURGICAL SUPPLIE in XML. This is because & marks the start of an entity in XML and needs to be escaped if it not has that meaning.

    So you would must substtute all & with &amp;.

    =IFERROR(FILTERXML("<a><b>"&SUBSTITUTE(SUBSTITUTE(SUBSTITUTE($A2,"&","&amp;"),B$1&":","<r/>"),CHAR(10),"</b><b>")&"</b></a>","//b[r]["&COUNTIF($B$1:B$1,B$1)&"]"),"")
    

    But there are multiple other such characters. Character < for example would must be &lt;. This is because < marks the start of a tag in XML and needs to be escaped if it not has that meaning.

    =IFERROR(FILTERXML("<a><b>"&SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE($A2,"<","&lt;"),"&","&amp;"),B$1&":","<r/>"),CHAR(10),"</b><b>")&"</b></a>","//b[r]["&COUNTIF($B$1:B$1,B$1)&"]"),"")
    

    Maybe there are others too, so the SUBSTITUTE chain gets longer and longer.