Search code examples
sqlsql-serversql-server-2016

Extract Only Specified Data in a string SQL Server


I have a review column in a table which has multiple strings formats like below,

Example strings:

Reviews_Column_Data [INPUT]
'05012:000000:  :0:00000000|00647:000000:  :0:00000000|00283:000000:  :0:00000000|'
'05012:000000:  :0:00000000|00025:000000:  :0:00000000|00647:000000:  :0:00000000|'
'05012:000000:  :0:00000000|02095:000000:  :0:00000000|00647:000000:  :0:00000000|'
'05012:000000:  :0:00000000|00647:000000:  :0:00000000|' 
'05081:023931:DF:9:20230111|00604:023931:XX:9:20230111|02470:023931:XX:9:20230111|00655:023931:XX:9:20230111|00464:023931:XX:9:20230111|02130:023931:XX:9:20230111|'
'05081:023931:DF:9:20230131|02229:023931:XX:9:20230131|02130:023931:XX:9:20230131|00692:023931:XX:9:20230131|02170:023931:XX:9:20230131|05084:000000:  :0:00000000|00647:000000:  :0:00000000|                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             '

I need to extract the review in Application table in the below format...

Application_Review_Column_Data [OUTPUT]

'05012,00647,00283'
'05012,00025,00647'
'05012,02095,00647'
'05012,00647', 
'05081,00604,02470,00655,00464,02130' 
'05081,02229,02130,00692,02170,05084,00647'                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             

Key Point to find the reviews is, starting point of the string and each end point of symbol "|" and Immediate ":"

I have tried with the below code, but it didn't work

 DROP TABLE IF EXISTS #Temp_Tbl
 Create table #Temp_Tbl (Comments varchar(500));

 INSERT INTO #Temp_Tbl
 VALUES('05012:000000:  :0:00000000|00647:000000:  :0:00000000|00283:000000:  :0:00000000|'),
 ('05012:000000:  :0:00000000|00025:000000:  :0:00000000|00647:000000:  :0:00000000|'),
 ('05012:000000:  :0:00000000|02095:000000:  :0:00000000|00647:000000:  :0:00000000|'),
 ('05081:023931:DF:9:20230131|02229:023931:XX:9:20230131|02130:023931:XX:9:20230131|00692:023931:XX:9:20230131|02170:023931:XX:9:20230131|')


Solution

  • As I have commented, use string_split to split on |. Then charindex() and left() to extract the required string. Finally string_agg() to concatenate it back

     select Comments, 
            string_agg(case when p > 0 then left(c.value, p - 1) end, ',')
     from   #Temp_Tbl t
            cross apply string_split(Comments, '|') c
            cross apply (select p = charindex(':', c.value) ) p
    group by Comments