Cleaning data by adding/removing characters to a string when it meets certain conditions T-SQL

I'm looking for help with cleaning a column in my data set so that I can join to another table.

The first data set is my complete data and includes what we call "reference_numbers" which relate to a specific case. Here is a dummy sample:

reference_number	case_opened	case_closed
01353568-00000001	11/01/2021	03/02/2022
09736473-00000009	21/04/2005	19/07/2021
05839576-00000012	13/09/2014	19/12/2017
09364857-00000006	13/09/2014	19/12/2017

As you can see, the "reference_number" is 8 digits then hyphen (-) and then another 8 digits. This is how a reference number should look.

My second data set is full of the same "reference_numbers". However, there is inconsistencies in the character length as they are often written differently by individuals:

reference_number	Case_workers
1353568-00000001	5
09736473-9	10
5839576-12	7
09364857-000000006	4

The first reference_number in the second data set is missing the first "0"
The second reference_number in the second data set is missing seven "0" after the hyphen
The third reference_number in the second data set is missing both the first "0" and six "0" after the hyphen
The fourth reference_number in the second data set has too many digits after the hyphen (there is supposed to be seven 0's)

I want to be able to join the first data set onto the second data set using the reference_number. However, I need to clean them first. Is this possible and is there any efficient way of doing this?

Thanks

Solution

If the rules are so specific, you could try to use a combination of STRING_SPLIT and STRING_AGG:

SELECT 
    t.reference_number,
    STRING_AGG(RIGHT('00000000'+s.value,8),'-') new_reference_number
FROM dbo.SecondTable t
CROSS APPLY STRING_SPLIT(t.reference_number,'-') s
GROUP BY t.reference_number
;

Using the sample data you posted, the results are:

╔════════════════════╦══════════════════════╗
║  reference_number  ║ new_reference_number ║
╠════════════════════╬══════════════════════╣
║ 09364857-000000006 ║ 09364857-00000006    ║
║ 09736473-9         ║ 09736473-00000009    ║
║ 1353568-00000001   ║ 01353568-00000001    ║
║ 5839576-12         ║ 05839576-00000012    ║
╚════════════════════╩══════════════════════╝