Search code examples
sql-servercharindex

Split string by using CHARINDEX() in RIGHT() or SUBSTRING() return incorrect result


I want to split a column into two. I want to select values from where cell value has '(' So here is my requirement: Input Strings:

col: mystr
----------
123(0)
233 (123)
23 (A)
2 (122)

Required Output:

Output
-------
(0)
(123)
(A)
(122)

I have done following:

 SELECT right(mystr,LEN(mystr)-
 CASE WHEN CHARINDEX('(',mystr)=0 THEN LEN(mystr) 
 ELSE CHARINDEX('(',mystr) END 
 + 1) 
 FROM docs

How it works: I want to select index where I found first '(' and then select values next to it. As CHARINDEX() work from Left to right. So instead of:

select right(mystr,CHARINDEX('(',mystr))

I subtracted index from total length LEN(mystr)-CHARINDEX('(',mystr).

Here I found a scenario when '(' was not found and 'CHARINDEX()' returned 0 So in case '(' was not found I made the whole term 0 by:

CASE WHEN CHARINDEX('(',mystr)=0 THEN LEN(mystr) 
ELSE CHARINDEX('(',mystr) END

Here first element is not selected so I added +1 to whole term but it results in an extra value:

mystr Out without +1,  Out with +1, Out with +1 moved inside else;  desired 
-----   ------------   -----------    -----------------             -------
 112        ''             2              ''                          ''
 1(0)       0)             (0)             )                          (0)
 1 (12)     12)            (12)            )                          (12)

I have also tried with substring() but it has same issue:

SELECT substring(mystr,
CASE WHEN CHARINDEX('(',mystr)=0 THEN LEN(mystr) 
ELSE CHARINDEX('(',mystr) END,
LEN(mystr)-CASE WHEN CHARINDEX('(',mystr)=0 THEN LEN(mystr) 
ELSE CHARINDEX('(',mystr)END +1) FROM docs

Solution

  • If open to a Table-Valued Functions, consider the following:

    Tired of extacting and parsing strings (left(), right(), charindex(), ...), I modified a parse function to accept two non-like delimiters.

    Example

    Declare @YourTable table (mystr varchar(50))
    Insert Into @YourTable values
    ('122'),
    ('123(0)'),
    ('233 (123)'),
    ('23 (A)'),
    ('2 (122)')
    
    Select A.*
          ,NewVal = IsNull('('+B.RetVal+')','')   -- Adding back the ()'s
     From  @YourTable A
     Outer Apply [dbo].[tvf-Str-Extract](A.mystr,'(',')') B
    

    Returns

    mystr       NewVal
    122 
    123(0)      (0)
    233 (123)   (123)
    23 (A)      (A)
    2 (122)     (122)
    

    The UDF if Interested

    CREATE FUNCTION [dbo].[tvf-Str-Extract] (@String varchar(max),@Delimiter1 varchar(100),@Delimiter2 varchar(100))
    Returns Table 
    As
    Return (  
    
    with   cte1(N)   As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
           cte2(N)   As (Select Top (IsNull(DataLength(@String),0)) Row_Number() over (Order By (Select NULL)) From (Select N=1 From cte1 N1,cte1 N2,cte1 N3,cte1 N4,cte1 N5,cte1 N6) A ),
           cte3(N)   As (Select 1 Union All Select t.N+DataLength(@Delimiter1) From cte2 t Where Substring(@String,t.N,DataLength(@Delimiter1)) = @Delimiter1),
           cte4(N,L) As (Select S.N,IsNull(NullIf(CharIndex(@Delimiter1,@String,s.N),0)-S.N,8000) From cte3 S)
    
    Select RetSeq = Row_Number() over (Order By N)
          ,RetPos = N
          ,RetVal = left(RetVal,charindex(@Delimiter2,RetVal)-1) 
     From  (
            Select *,RetVal = Substring(@String, N, L) 
             From  cte4
           ) A
     Where charindex(@Delimiter2,RetVal)>1
    
    )
    /*
    Max Length of String 1MM characters
    
    Declare @String varchar(max) = 'Dear [[FirstName]] [[LastName]], ...'
    Select * From [dbo].[tvf-Str-Extract] (@String,'[[',']]')
    */