Search code examples
sql-serverdenormalized

How to denormalize this SQL Server table?


I have a very wide table denormalized (if you can say that) like this, the option columns go to 100+

Year  ProductID ProductName Option1 Option2 Option3 ....Option100
-----------------------------------------------------------------
2016    1        Test1        A1      A1a     A3
2015    1        Test1        A1      A2      A2a

The problem is we have dynamic queries trying to determine the option column and then finding the option value in them

i.e.

@SQL= 'SELECT Option' + @getOptionNum +' FROM ProductMapping

Ideally I want this converted to something like this

Year ProductID ProductName OptionName
-------------------------------------
2016   1         Test1      Option1
2016   1         Test1      Option2
2016   1         Test1      Option3
2015   1         Test1      Option1
2015   1         Test1      Option2
2015   1         Test1      Option3

OptionID OptionName OptionValue  Year
-------------------------------------
  1        Option1     A1        2016
  2        Option2     A1a       2016
  3        Option3     A3        2016
  4        Option1     A1        2015
  5        Option2     A2        2015
  6        Option3     A2a       2015

SELECT * 
FROM ProductMapping map
LEFT JOIN OptionList list ON map.OptionName = list.OptionName
                          AND map.Year = list.Year
                          AND map.OptionName = 'Option1'

The problem I am running into is how to convert that wide table into the two tables structure through queries since it's a lot of columns and rows and I cannot normalize all of that manually.

Yes I also understand ideally the 2nd table needs to be normalized further to keep the Option1...Option3 in a separate table and the Option1..A1 mapping in a separate table but it's a start...

Hopefully the simple example sheds light of the following facts

  1. Option1...100 columns need to be normalized in a separate table
  2. The option columns to values mapping changes every year

Any thoughts?


Solution

  • Declare @YourTable table (Year int,ProductID int,ProductName varchar(50),Option1 varchar(50),Option2 varchar(50),Option3 varchar(50))
    Insert into @YourTable values
    (2016,1,'Test1','A1','A1a','A3'),
    (2015,1,'Test1','A1','A2','A2a')
    
    Declare @XML xml
    Set @XML = (Select * from @YourTable for XML RAW)
    
    Select * 
     From  (
            Select Year        = r.value('@Year','int')
                  ,ProductID   = r.value('@ProductID','int')
                  ,ProductName = r.value('@ProductName','varchar(50)')
                  ,OptionName  = Attr.value('local-name(.)','varchar(100)')
                From  @XML.nodes('/row') as A(r)
                Cross Apply A.r.nodes('./@*[local-name(.)!="ProductID"]') as B(Attr)
           ) A
     Where OptionName Like 'Option%'
     Order by Year Desc,OptionName
    
    Select OptionID=Row_Number() over (Order By Year Desc,OptionName),* 
     From (
            Select OptionName  = Attr.value('local-name(.)','varchar(100)')
                  ,OptionValue = Attr.value('.','varchar(100)') 
                  ,Year        = r.value('@Year','int')
             From  @XML.nodes('/row') as A(r)
             Cross Apply A.r.nodes('./@*[local-name(.)!="ProductID"]') as B(Attr)
             --CROSS APPLY A.r.nodes('./@*') AS B(Attr)
            ) A 
     Where OptionName Like 'Option%'
    

    Returns

    Year    ProductID   ProductName OptionName
    2016    1           Test1       Option1
    2016    1           Test1       Option2
    2016    1           Test1       Option3
    2015    1           Test1       Option1
    2015    1           Test1       Option2
    2015    1           Test1       Option3
    

    and

    OptionID    OptionName  OptionValue Year
    1           Option1     A1          2016
    2           Option2     A1a         2016
    3           Option3     A3          2016
    4           Option1     A1          2015
    5           Option2     A2          2015
    6           Option3     A2a         2015
    

    EDIT

    Now if you wanted a STRAIGHT Normalization

    Declare @YourTable table (Year int,ProductID int,ProductName varchar(50),Option1 varchar(50),Option2 varchar(50),Option3 varchar(50))
    Insert into @YourTable values
    (2016,1,'Test1','A1','A1a','A3'),
    (2015,1,'Test1','A1','A2','A2a')
    
    Declare @XML xml
    Set @XML = (Select * from @YourTable for XML RAW)
    
    
    Select ID    = r.value('@id','int')                             --<<'@id' Should be YOUR PK
          ,Item  = Attr.value('local-name(.)','varchar(100)')
          ,Value = Attr.value('.','varchar(max)') 
     From  @XML.nodes('/row') as A(r)
     Cross Apply A.r.nodes('./@*[local-name(.)!="id"]') as B(Attr)  --<<'id' Should be YOUR PK
     --CROSS APPLY A.r.nodes('./@*') AS B(Attr)
    

    Returns (the nulls would normally be your PK)

    ID      Item        Value
    NULL    Year        2016
    NULL    ProductID   1
    NULL    ProductName Test1
    NULL    Option1     A1
    NULL    Option2     A1a
    NULL    Option3     A3
    NULL    Year        2015
    NULL    ProductID   1
    NULL    ProductName Test1
    NULL    Option1     A1
    NULL    Option2     A2
    NULL    Option3     A2a