Search code examples
sqlsql-serversql-server-2016

Having issues with removing preceding zeroes and characters after special characters from a varchar column


I am trying to remove alphabets, space from left, right and in between, preceding zeros, and all the characters after special characters(comma,semicolon, backslash, forward slash)

I have a column that contains value like the below :

051 765 999
MK00564123
KJ786453425/9
432563542/3
096745632
53452BMG441,8

I am trying to remove alphabets and space from left and right from the value for those values containing alphabets in the value and also preceding 0.

Expected Output:

51765999
564123
96745632

Also, I am trying to remove everything after a special character in the column(comma,semicolon, backslash, forward slash):

Expected Output:

786453425
432563542
53452441

Final Output:

51765999
564123
96745632
786453425
432563542
53452441

I have created the fiddle but not getting the expected output as shown below:

enter image description here

Fiddle: http://sqlfiddle.com/#!18/0b383/1123

Can someone please help me to identify what is the issue and how can I get the expected output?


Solution

  • One way of doing this is

    CREATE FUNCTION dbo.udf_GetCleanedAlphaNumeric (@alphanumeric VARCHAR(100))
    RETURNS TABLE
    AS
        RETURN
          (SELECT SUBSTRING(cleaned, PATINDEX('%[^0]%', cleaned + 'x'), 8000) AS Final
           FROM   (VALUES (SUBSTRING(@alphanumeric, 1, PATINDEX('%[,;\/]%', @alphanumeric + ';') - 1))) ca1(prefix)
                  CROSS APPLY (VALUES ( REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
                                        REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
                                        REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
                                        REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
                                        REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
                                        REPLACE(REPLACE(prefix, ' ', ''), 'a', '')
                                        , 'b', ''), 'c', ''), 'd', ''), 'e', ''), 'f', '')
                                        , 'g', ''), 'h', ''), 'i', ''), 'j', ''), 'k', '')
                                        , 'l', ''), 'm', ''), 'n', ''), 'o', ''), 'p', '')
                                        , 'q', ''), 'r', ''), 's', ''), 't', ''), 'u', '')
                                        , 'v', ''), 'w', ''), 'x', ''), 'y', ''), 'z', '') )) ca2(cleaned)) 
    

    and then

    SELECT alphanumeric,
           Final
    FROM   temp
    CROSS APPLY dbo.udf_GetCleanedAlphaNumeric(alphanumeric)
    
    • First extract the substring before any ,;\/ (if present)
    • Then strip out any letters or spaces (can be made much more concise on later versions with TRANSLATE function as in edit history)
    • Then remove leading zeroes

    (DB Fiddle)