Search code examples
excelfilterexcel-formulaconcatenationsubstitution

Remove special characters and numbers in a number string


I would like to cleanup some string that contains special characters and numbers.

I tried some formula in achieving this with a string that contains letters and numbers:

=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(C2,"0",""),"1",""),"2",""),"3",""),"4",""),"5",""),"6",""),"7",""),"8",""),"9",""),";#;#",", "),";#","")

Example 1

However, my main problem here is that I would like to clean up a string which contains numbers as shown below:

Example 2

Is there anyway I can achieve the above screenshot by tweaking the formula I used for the first example? As the formula in the first example would only delete all the characters in the string.


Solution

  • Interesting challenge, and it might be done smarter/shorter, but I wanted to give it a try using formulas only (another option I could think of is using PowerQuery):


    So bascially in B2 put:

    =TEXTJOIN(", ",TRUE,TEXT(FILTERXML("<t><s>"&SUBSTITUTE(A2,";#","</s><s>")&"</s></t>","//s[string-length(text())>3]"),"0000"))
    

    Results:

    enter image description here


    Note: It's an array formula and needs to be confirmed through CtrlShiftEnter

    And untill MS introduces some sort of SPLIT formula to the DA-formula family I think FILTERXML is going to be your best bet.