I am using AWS Athena, so functions are a bit limiting. But essentially I want to extract the first 5 consecutive and sequential numbers from a alphanumeric field.
From the first example, you can see it ignores the first 1 because there aren't 4 trailing numbers. I want to find and extract the first 5 numbers that are given together from this field. The output field is what I am hoping to achieve.
This will find an exact sequence of 5 digits.
a sequence of less or more than 5 digits will be ignored.
^|\D = Indication for the start of the text OR a non-digit character
\d{5} = 5 digits
\D|$ = A non-digit character OR indication for the end of the text
with t (Example) as (values ('Ex/l/10345/Pl'), ('Ex/23453PlWL'), ('ID09456//'))
select Example, regexp_extract(Example, '(^|\D)(\d{5})(\D|$)', 2) as Output
from t
+---------------+--------+
| Example | Output |
+---------------+--------+
| Ex/l/10345/Pl | 10345 |
| Ex/23453PlWL | 23453 |
| ID09456// | 09456 |
+---------------+--------+