I'm curious on the data I get from someone. Most of the time I need to get 3 integers then a space then eight integers.
And The integration created a column varchar(20) ... Don't doubt it works, but that gives me some matching errors.
Because of this, I'd like to know what is the data type of the characters on each row.
For exemple : 0 is for integer, s for space, a for char, * for specific
AWB | data type
---------------------------------
012 12345678 | 000s00000000
9/5 ab0534 | 0*0saa0000
I'd like to know if there is a function or a formula to get this kind of results. Right after I'll be able to group by this column and finally be able to check how good is the data quality.
I don't know if there is a specific word for what I tried to explain, so excuse me if this is a duplicate of a post, I didn't find it.
Thank you for your feedback.
There's nothing built-in, but you might use an approach like this:
DECLARE @tbl TABLE(ID INT IDENTITY,AWB VARCHAR(100));
INSERT INTO @tbl VALUES
('012 12345678')
,('9/5 ab0534');
WITH cte AS
(
SELECT t.ID
,t.AWB
,A.Nmbr
,C.YourMask
FROM @tbl t
CROSS APPLY (SELECT TOP (DATALENGTH(t.AWB)) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values) A(Nmbr)
CROSS APPLY (SELECT SUBSTRING(t.AWB,A.Nmbr,1)) B(SingleCharacter)
CROSS APPLY (SELECT CASE WHEN B.SingleCharacter LIKE '[0-9]' THEN '0'
WHEN B.SingleCharacter LIKE '[a-z]' THEN 'a'
WHEN B.SingleCharacter = ' ' THEN 's'
ELSE '*' END) C(YourMask)
)
SELECT ID
,AWB
,(
SELECT YourMask
FROM cte cte2
WHERE cte2.ID=cte.ID
ORDER BY cte2.Nmbr
FOR XML PATH(''),TYPE
).value('.','nvarchar(max)') YourMaskConcatenated
FROM cte
GROUP BY ID,AWB;
The idea in short:
The cte
will create a derived set of your table.
The first CROSS APPLY
will create a list of numbers as long as the current AWB
value.
The second CROSS APPLY
will read each character separately.
The third CROSS APPLY
will finally use some rather simple logic to translate your values to the mask you expect.
The final SELECT
will then use GROUP BY
and a correlated sub-query with FOR XML
to get the mask characters re-concatenated (With version v2017+ this would be easier calling STRING_AGG()
).