I have a Staging
table with the following columns
Videos have a many to many relationship with the tags. I've created the following new tables:
How do I parse the data in the Staging table into the three new tables? The Tag table should contain no duplicates. Also, I need to do a little formatting on the data before it is inserted into the new tables e.g. need to strip all letters from the Length column.
Using an auxiliary numbers table, you can split the tags column into rows while keeping it associated with the VideoURL:
CREATE TABLE NumberPivot (NumberID INT PRIMARY KEY)
DECLARE @intLoopCounter INT
SELECT @intLoopCounter =0
SET NOCOUNT ON
WHILE @intLoopCounter <=999 BEGIN
INSERT INTO NumberPivot
VALUES (@intLoopCounter)
SELECT @intLoopCounter = @intLoopCounter +1
END
GO
SELECT
ContentPageID,
Substring(',' + Tags + ','
, numberID + 1
, Charindex(',', ',' + Tags + ',', numberID + 1) - numberid - 1) AS value
FROM dbo.NumberPivot AS np,
Staging AS S
WHERE numberid <= Len(',' + Tags + ',') - 1
AND Substring(',' + Tags + ',', numberID, 1) = ','
So here we fill the Tags table with unique tags:
;WITH X AS (
SELECT
VideoURL,
Substring(',' + Tags + ',', numberID + 1, Charindex(',', ',' + Tags + ',', numberID + 1) - numberid - 1) AS Tag
FROM dbo.NumberPivot AS np,
Staging AS S
WHERE numberid <= Len(',' + Tags + ',') - 1
AND Substring(',' + Tags + ',', numberID, 1) = ','
)
INSERT Tag (Tag)
SELECT DISTINCT Tag FROM X;
Next fill the Videos table:
INSERT Video (VideoURL, Rating, Length, Thumbnail)
SELECT VideoURL, Rating, Length, Thumbnail
FROM Staging;
Finally fill VideoTag:
INSERT VideoTag (VideoURL, Tag)
SELECT
VideoURL,
Substring(',' + Tags + ',', numberID + 1, Charindex(',', ',' + Tags + ',', numberID + 1) - numberid - 1) AS Tag
FROM dbo.NumberPivot AS np,
Staging AS S
WHERE numberid <= Len(',' + Tags + ',') - 1
AND Substring(',' + Tags + ',', numberID, 1) = ','
Got the split string using number table from here