Search code examples
t-sqlsql-server-2008-r2substringpatindex

Select query to remove non-numeric characters


I've got dirty data in a column with variable alpha length. I just want to strip out anything that is not 0-9.

I do not want to run a function or proc. I have a script that is similar that just grabs the numeric value after text, it looks like this:

Update TableName
set ColumntoUpdate=cast(replace(Columnofdirtydata,'Alpha #','') as int)
where Columnofdirtydata like 'Alpha #%'
And ColumntoUpdate is Null

I thought it would work pretty good until I found that some of the data fields I thought would just be in the format Alpha # 12345789 are not.

Examples of data that needs to be stripped

AB ABCDE # 123
ABCDE# 123
AB: ABC# 123

I just want the 123. It is true that all data fields do have the # prior to the number.

I tried substring and PatIndex, but I'm not quite getting the syntax correct or something. Anyone have any advice on the best way to address this?


Solution

  • See this blog post on extracting numbers from strings in SQL Server. Below is a sample using a string in your example:

    DECLARE @textval NVARCHAR(30)
    SET @textval = 'AB ABCDE # 123'
    
    SELECT LEFT(SUBSTRING(@textval, PATINDEX('%[0-9.-]%', @textval), 8000),
               PATINDEX('%[^0-9.-]%', SUBSTRING(@textval, PATINDEX('%[0-9.-]%', @textval), 8000) + 'X') -1)