I wonder if there is a way, just using a select statement, without involving loop/for/while control statements or creating any function in SQL Server, to split a string into n
segments of equal length m
?
For example, the string s looks like this:
s='AaaaaBbbbbCccccDdddd' with n=4 and m=5
n and m are known constants and would not change in the context
I would like the result looking like this:
Aaaaa
Bbbbb
Ccccc
Ddddd
Of course, this could be easily done by creating a function with a loop control.
I am just curious if it is possible to get the same results with just a select statement?
I would, personally, use a tally for this. An rCTE works, however, if you are working with larger amounts of data, or (much) longer strings, then the recursiveness of an rCTE will impact performance (I discuss this further here).
As you state you can't create a function, I use an inline tally instead (which admittedly makes things look more complicated), to put the string into relevant segments. If the string doesn't fit the split evenly, the last value will be shorter.
DECLARE @s varchar(MAX),
@m int;
SET @s = 'AaaaaBbbbbCccccDdddd';
SET @m = 5;
--Data set of NULLs
WITH N AS(
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N)),
--Build the Tally
Tally AS(
SELECT TOP(LEN(@s)) --Limit to length of the string
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I
FROM N N1, N N2, N N3, N N4, N N5, N N6, N N7) --Up to 10,000,000 rows
SELECT SUBSTRING(@s, T.I, @m) AS sp
FROM Tally T
WHERE (T.I - 1) % @m = 0
ORDER BY T.I;
If this is against a table, it may look something like this:
CREATE TABLE dbo.YourTable (ArbitraryID int IDENTITY,
YourString varchar(MAX),
SegmentSize int);
INSERT INTO dbo.YourTable (YourString, SegmentSize)
VALUES('AaaaaBbbbbCccccDdddd',5),
('1234567890',3);
GO
WITH N AS(
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N)),
Tally AS(
SELECT TOP(SELECT MAX(LEN(YourString)) FROM dbo.YourTable)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I
FROM N N1, N N2, N N3, N N4, N N5, N N6, N N7)
SELECT YT.ArbitraryID,
SUBSTRING(YT.YourString, T.I, YT.SegmentSize) AS sp
FROM dbo.YourTable YT
JOIN Tally T ON LEN(YT.YourString) >= T.I
WHERE (T.I - 1) % YT.SegmentSize = 0
ORDER BY YT.ArbitraryID,
T.I;
GO
--Clean up
DROP TABLE dbo.YourTable;
If you are on 2022+, then GENERATE_SERIES
significantly simplifies this:
DECLARE @s varchar(MAX),
@m bigint; --bigint as we're using a MAX length string. If you aren't, use int
SET @s = '1234567890';
SET @m = 3;
SELECT SUBSTRING(@s, GS.value, @m)
FROM GENERATE_SERIES(CONVERT(bigint,1),LEN(@s), @m) GS --All values have to be the same data type, hence explicit CONVERT on the first parameter
ORDER BY GS.value;
GO
-- Or with the prior table
SELECT YT.ArbitraryID,
SUBSTRING(YT.YourString, GS.value, YT.SegmentSize)
FROM dbo.YourTable YT
CROSS APPLY GENERATE_SERIES(CONVERT(bigint,1), LEN(YT.YourString), CONVERT(bigint,YT.SegmentSize)) GS --All values have to be the same data type, hence explicit CONVERTs
ORDER BY YT.ArbitraryID,
GS.value;