Trying to extract specific string from a varchar column in SQL Server. And then from that string I need the last 4 digits only.
I tried substring
& patindex
but can't figure out how to get the last 4 digits.
This was one attempt but it only returns the first part of the text:
SUBSTRING([column], (PATINDEX('%ID0%-[0-9][0-9][0-9][0-9][0-9]%',[column])),9)
Here's some sample data:
Brian Larry, Hays Hill XX, ID007370, Option code - 0123 4567, AAA - XX
ID007366 - Dave Jones - XX - Option Code 0121 9999
ID00 7120 Brian Smith XX Branded company
ID07113 Gary Barnes - LLL 0123 9111 AAA LLL
ID00 7120 Charles Old XX Recall operation
ID0007439 - Kerry Hill - Maidstone XX - Option Code 0124 234 BBB XX
ID006817 Paul George Jackson 1234 8464 AAA Recall operation
I need to extract the ID00000
number, but then only return the last 4 digits. So for ID007370
, I only want to return 7370
. The ID numbers can vary in length.
Something like this looks to be working:
SELECT RIGHT(SUBSTRING(string_nonspace, y.start, PATINDEX('%[^0-9]%', STUFF(string_nonspace, 1, start + 1, '')) + 1), 4)
FROM (
VALUES (N'Brian Larry, Hays Hill XX, ID007370, Option code - 0123 4567, AAA - XX')
, (N'ID007366 - Dave Jones - XX - Option Code 0121 9999')
, (N'ID00 7120 Brian Smith XX Branded company')
, (N'ID07113 Gary Barnes - LLL 0123 9111 AAA LLL')
, (N'ID00 7120 Charles Old XX Recall operation')
, (N'ID0007439 - Kerry Hill - Maidstone XX - Option Code 0124 234 BBB XX')
, (N'ID006817 Paul George Jackson 1234 8464 AAA Recall operation')
) t (col1)
CROSS APPLY (
SELECT replace(col1, ' ', '') AS string_nonspace
) x
CROSS APPLY (
SELECT PATINDEX('%ID[0-9]%', string_nonspace) AS start
) y
Output:
ID | string_nonspace |
---|---|
7370 | BrianLarry,HaysHillXX,ID007370,Optioncode-01234567,AAA-XX |
7366 | ID007366-DaveJones-XX-OptionCode01219999 |
7120 | ID007120BrianSmithXXBrandedcompany |
7113 | ID07113GaryBarnes-LLL01239111AAALLL |
7120 | ID007120CharlesOldXXRecalloperation |
7439 | ID0007439-KerryHill-MaidstoneXX-OptionCode0124234BBBXX |
6817 | ID006817PaulGeorgeJackson12348464AAARecalloperation |
I remove spaces and then find starting position by PATINDEX('%ID[0-9]%'
. Then i clip the string so ID-part is removed, and then search for first non-number value. When that done, you can do a SUBSTRING between start and the first non-number value. Finally, RIGHT(..., 4)
gets the last four characters.
Of course this code will fail on many malformed strings, but them the breaks, the in-data kinda "blowers"