Search code examples
sql-servert-sqlsplitpositionsql-server-2016

Multivalue data into columns into rows


I am trying to split data from various fields and then I have to take it from columns into rows. Currently all the charges for 1 client is stored in one field. So is all the descriptions of the charges stored in another field. I have data that currently looks like this in SQL

ID              ChargeType    ChargeAmount
1000:1597       F^F^F         1000^500^250
01000:6597      F^F^F^F^F     500^250^50^2000^1000
00010:0001      F             70

I have used mainly XML to split the charges and the charge types into columns. I have successfully split charges and the amount and able to join but that gave me the incorrect count of charges.

Create table Charges(
        IDx int,
        ID varchar(55),
        ChargeAmount varchar(55),
        ChargeType varchar(55)
        )


DECLARE @chargetype TABLE(IDx int Identity, ID varchar(max), data1 varchar(max))

INSERT INTO @chargetype SELECT ID, ChargesType from TrnDeal
SELECT 
F1.IDx,
F1.ID,
O.ChargesType
into #tempsplitchargetype
FROM
 (
 SELECT *,
 cast('<Y>'+replace(F.data1,'^','</Y><Y>')+'</Y>' as XML) as xmlfilter
  from @chargetype F
  )F1
  CROSS Apply
  ( 
  SELECT fdata1.D.value('.','varchar(50)') as ChargesType
  FROM F1.xmlfilter.nodes('Y') as fdata1(D)) O

 DECLARE @chargeamount TABLE(IDx int IDENTITY, ID varchar(max), data1                     
varchar(max))
INSERT INTO @chargeamount SELECT ID, ChargesAmount from TrnDeal
 SELECT 
S1.IDx,
S1.ID,
M.ChargeAmount
into #tempsplitchargeamount
FROM
 (
 SELECT *,
 cast('<X>'+replace(S.data1,'^','</X><X>')+'</X>' as XML) as xmlfilter
 from @chargeamount S
 )S1
 CROSS APPLY
 ( 
 SELECT Fdata.D.value('.','varchar(50)') as ChargeAmount 
 FROM S1.xmlfilter.nodes('X') as fdata(D)) M

-- Insert into dbo.Charges
 Select CA.IDx, CA.ID, CA.ChargeAmount, b.ChargesType
    from #tempsplitchargeamount CA
    outer apply (Select IDx, ChargesType from #tempsplitchargetype) b
    where CA.IDx = b.IDx

I am expecting the data to look as follow for 1 client as example.

ID              ChargeType    ChargeAmount
1000:1597       F             1000
1000:1597       F             500
1000:1597       F             250
01000:6597      F             500
01000:6597      F             250
01000:6597      F             50
01000:6597      F             2000
01000:6597      F             1000

Solution

  • You tagged this with [SSMS-2017]. This is not enough to be sure about your RDBMS, but I assume, that you are using SQL-Server 2016 or a higher version...

    First of all I'd like to state, that your most important question is: Why do you store this in CSV-format? This is breaking 1.NF... If you can change the table's layout, you should use a related side-table.

    But - if you have to stick with it - you can use a trick with JSON:

    DECLARE @mockup TABLE(ID VARCHAR(100),ChargeType VARCHAR(100),ChargeAmount VARCHAR(100));
    INSERT INTO @mockup(ID,ChargeType,ChargeAmount) VALUES
     ('1000:1597','F^F^F','1000^500^250')
    ,('01000:6597','F^F^F^F^F','500^250^50^2000^1000')
    ,('00010:0001','F','70');
    
    SELECT m.ID
          ,B.[key] AS Position
          ,B.[value] AS ChargeType
          ,JSON_VALUE(A.AmountAsJson,CONCAT('$[',B.[key],']') COLLATE Latin1_General_BIN2) AS ChargeAmount
    FROM @mockup m
    CROSS APPLY (SELECT JSON_QUERY('["' + REPLACE(m.ChargeAmount,'^','","') + '"]')) A(AmountAsJson)
    CROSS APPLY OPENJSON('["' + REPLACE(m.ChargeType,'^','","') + '"]') B 
    ORDER BY m.ID 
            ,Position;
    

    The result:

    +------------+----------+------------+--------------+
    | ID         | Position | ChargeType | ChargeAmount |
    +------------+----------+------------+--------------+
    | 00010:0001 | 0        | F          | 70           |
    +------------+----------+------------+--------------+
    | 01000:6597 | 0        | F          | 500          |
    +------------+----------+------------+--------------+
    | 01000:6597 | 1        | F          | 250          |
    +------------+----------+------------+--------------+
    | 01000:6597 | 2        | F          | 50           |
    +------------+----------+------------+--------------+
    | 01000:6597 | 3        | F          | 2000         |
    +------------+----------+------------+--------------+
    | 01000:6597 | 4        | F          | 1000         |
    +------------+----------+------------+--------------+
    | 1000:1597  | 0        | F          | 1000         |
    +------------+----------+------------+--------------+
    | 1000:1597  | 1        | F          | 500          |
    +------------+----------+------------+--------------+
    | 1000:1597  | 2        | F          | 250          |
    +------------+----------+------------+--------------+
    

    The idea in short:

    String splitting is a pain within SQL-Server's TSQL. There were various workarounds using loops, recursive CTEs or XML (as you do above).
    With v2016 developers were very happy about the new function STRING_SPLIT(), but MS forgot to provide the fragment's position. This makes STRING_SPLIT() a pretty useless function...

    But - together with STRING_SPLIT - JSON support was introduced. Reading a simple JSON-array with OPENJSON will provide the fragment's position in the column [key] (attention: zero-based!).

    The code above will first create a mockup-table and fill it with your sample data.
    The firstCROSS APPLY will transform the ChargeAmount into a JSON-array without doing anything with it.
    The secondCROSS APPLY will transform the ChargeType into a JSON-array and return one row per fragment.
    The select then uses JSON_VALUE to pick the corresponding value out of AmountAsJson by using the [key] (the position) as JSON path.

    Hope this was clear...