Search multiple occurrences of a word in a string

I have the below text in a column in a SQL Server table.

"55-9988, Version 1.0 dated 07/20/2009
3684 for 66-0022
IB from Microsoft , for Monovalent A/king Influenza Subvirion Vaccine, Version 1.0 dated 06/27/2009
Package Insert from Microsoft , for Fluzone, dated 06/2008
Package Insert from Microsoft , for H5N1, dated 04/2007
IB from google, for AS93 as an Adjuvant for use with a  king Vaccine, Version 1.0 dated 07/2009
Package Insert from Microsoft , for Fluzone, Version 37 dated 06/18/2008
55-9988 MIA, Version 1.0 dated 07/20/2009"

This is an example from one cell and I have multiple rows with similar text.

In this case, I need to obtain the pattern xx-xxxx (55-9988, 66-0022) from the above string and I need all occurrences.

I'm able to obtain the first instance of the occurrence, but I'm unable to get the remaining occurrences from the text.

Solution

Please try the following solution based on the STRING_SPLIT() function use.

Also, it is using an enhanced TRIM() function. So, it will work starting from SQL Server 2017 onwards.

LIKE '[0-9][0-9]-[0-9][0-9][0-9][0-9]%' clause is looking for tokens that are matching a particular pattern.

I added Method #2 XML based implementation to support SQL Server 2008.

SQL

-- DDL and sample data population, start
DECLARE @tbl TABLE (id INT IDENTITY PRIMARY KEY, tokens VARCHAR(1024));
INSERT INTO @tbl (tokens) VALUES
('55-9988, Version 1.0 dated 07/20/2009
3684 for 66-0022
IB from Microsoft , for Monovalent A/king Influenza Subvirion Vaccine, Version 1.0 dated 06/27/2009
Package Insert from Microsoft , for Fluzone, dated 06/2008
Package Insert from Microsoft , for H5N1, dated 04/2007
IB from google, for AS93 as an Adjuvant for use with a  king Vaccine, Version 1.0 dated 07/2009
Package Insert from Microsoft , for Fluzone, Version 37 dated 06/18/2008
55-9988 MIA, Version 1.0 dated 07/20/2009'),
('fafa');
-- DDL and sample data population, end

-- Method #1
-- SQL Server 2017 onwards
DECLARE @CrLf CHAR(2) = CHAR(13) + CHAR(10);

SELECT id, token
FROM @tbl
CROSS APPLY STRING_SPLIT(REPLACE(tokens, @CrLf,SPACE(1)), SPACE(1))
CROSS APPLY (SELECT TRIM(',' FROM value)) AS t(token)
WHERE token LIKE '[0-9][0-9]-[0-9][0-9][0-9][0-9]%';

-- Method #2
-- SQL Server 2008 onwards
DECLARE @separator CHAR(1) = SPACE(1);

SELECT id, REPLACE(token, ',', '') AS token
FROM @tbl
CROSS APPLY (SELECT CAST('<root><r><![CDATA[' + 
      REPLACE(REPLACE(tokens, @CrLf,@separator), @separator, ']]></r><r><![CDATA[') + 
      ']]></r></root>' AS XML)) AS t1(c)
CROSS APPLY c.nodes('/root/r/text()') AS t2(x)
CROSS APPLY (SELECT x.value('.', 'VARCHAR(50)')) AS t3(token)
WHERE token LIKE '[0-9][0-9]-[0-9][0-9][0-9][0-9]%';

Output

id	token
1	55-9988
1	66-0022
1	55-9988