Search code examples
sqlsql-serverdatabase-normalization

How to parse SQL table into multiple many-to-many tables


I have a Staging table with the following columns

  • VideoURL
  • Rating
  • Length
  • Thumbnail
  • Tags (comma-delimited)

Videos have a many to many relationship with the tags. I've created the following new tables:

  • Video
  • Tag
  • VideoTag

How do I parse the data in the Staging table into the three new tables? The Tag table should contain no duplicates. Also, I need to do a little formatting on the data before it is inserted into the new tables e.g. need to strip all letters from the Length column.


Solution

  • Using an auxiliary numbers table, you can split the tags column into rows while keeping it associated with the VideoURL:

    CREATE TABLE NumberPivot (NumberID INT PRIMARY KEY)
    DECLARE @intLoopCounter INT
    SELECT @intLoopCounter =0
    SET NOCOUNT ON
    
    WHILE @intLoopCounter <=999 BEGIN
       INSERT INTO NumberPivot
       VALUES (@intLoopCounter)
       SELECT @intLoopCounter = @intLoopCounter +1
    END
    GO
    
    
    SELECT
      ContentPageID,
      Substring(',' + Tags + ','
                , numberID + 1
                , Charindex(',', ',' + Tags + ',', numberID + 1) - numberid - 1) AS value 
    FROM   dbo.NumberPivot  AS np,
           Staging AS S
    WHERE  numberid <= Len(',' + Tags + ',') - 1
       AND Substring(',' + Tags + ',', numberID, 1) = ',' 
    

    So here we fill the Tags table with unique tags:

    ;WITH X AS (
    SELECT
      VideoURL,
      Substring(',' + Tags + ',', numberID + 1, Charindex(',', ',' + Tags + ',', numberID + 1) - numberid - 1) AS Tag
    FROM   dbo.NumberPivot  AS np,
           Staging AS S
    WHERE  numberid                                      <= Len(',' + Tags + ',') - 1
       AND Substring(',' + Tags + ',', numberID, 1) = ',' 
    )
    INSERT Tag (Tag)
    SELECT DISTINCT Tag FROM X;
    

    Next fill the Videos table:

    INSERT Video (VideoURL, Rating, Length, Thumbnail)
    SELECT VideoURL, Rating, Length, Thumbnail
    FROM Staging;
    

    Finally fill VideoTag:

    INSERT VideoTag (VideoURL, Tag)
    SELECT
      VideoURL,
      Substring(',' + Tags + ',', numberID + 1, Charindex(',', ',' + Tags + ',', numberID + 1) - numberid - 1) AS Tag
    FROM   dbo.NumberPivot  AS np,
           Staging AS S
    WHERE  numberid                                      <= Len(',' + Tags + ',') - 1
       AND Substring(',' + Tags + ',', numberID, 1) = ',' 
    

    Got the split string using number table from here