Search code examples
sqlsql-serverconcatenation

Concatenate onto Next Row


I have some SQL that does some manipulation to the data i.e. filling in empty columns.

SELECT *,
    ModifiedLineData = CASE
        WHEN Column2 = '' AND LineData NOT LIKE ',,,0,,,,0'
            THEN CONCAT(STUFF(LineData, CHARINDEX(',', LineData, CHARINDEX(',', LineData) + 1), 0, '"No PO Number"'), ',""')
        ELSE CONCAT(LineData, ',""')
    END
FROM (
    SELECT
        *,
        Column2 = CONVERT(XML, '<s>' + REPLACE((SELECT ISNULL(LineData, '') FOR XML PATH('')), ',', '</s><s>') + '</s>').value('/s[2]', 'varchar(100)')
    FROM [dbo].[Temp_Raw_Data]
    WHERE LineData NOT LIKE ',,,0,,,,0'
) AS Subquery

Now lets say this returns

FileName LineNumber LineData Column2 ModifiedLineData
file1 4 1232,,"product-1", 1,0 1232,NA,"product-1", 1,0
file2 7 "failed" NULL "failed"
file3 8 1235,,"product-2", 1,0 1235,NA,"product-2", 1,0

How can I modify this query so that if Column2 is NULL then it would concatenate the LineData onto the next row (ModifiedLineData) else just concatenate a ,"" and then remove that NULL result (if possible else it doesnt matter) so that my result would look like:

FileName LineNumber LineData Column2 ModifiedLineData
file1 4 1232,,"product-1", 1,0 1232,NA,"product-1", 1,0,""
file3 8 1235,,"product-2", 1,0 1235,NA,"product-2", 1,0,"failed"

I tried playing around with LEAD() but couldn't get it how i wanted.

Note: Two null rows are not possible to be together. This is due to the nature of the data. The next row should simply be the next available row when selecting all rows as they are imported one by 1.

Updated Query that isn't concatenating:

SELECT * 
  FROM (SELECT FileName, LineNumber, LineData, Column2, 
               CASE WHEN LAG(Column2) OVER(ORDER BY LineNumber) IS NULL
                    THEN CONCAT_WS(', ',
                                ModifiedLineData, 
                                LAG(ModifiedLineData) OVER(ORDER BY LineNumber))
                    ELSE ModifiedLineData
               END AS ModifiedLineData
        FROM (
        SELECT *,
            ModifiedLineData = CASE
                WHEN Column2 = '' AND LineData NOT LIKE ',,,0,,,,0'
                    THEN CONCAT(STUFF(LineData, CHARINDEX(',', LineData, CHARINDEX(',', LineData) + 1), 0, '"No PO Number"'), '')
                ELSE CONCAT(LineData, '')
            END
        FROM (
            SELECT *,
                Column2 = CONVERT(XML, '<s>' + REPLACE((SELECT ISNULL(LineData, '') FOR XML PATH('')), ',', '</s><s>') + '</s>').value('/s[2]', 'varchar(100)')
            FROM [backstreet_WMS_Optimizer].[dbo].[Temp_GoodsIn_Raw_Data]
            WHERE LineData NOT LIKE ',,,0,,,,0'
        ) AS Subquery
    ) AS cte
) AS Subquery
WHERE Column2 IS NOT NULL
order by FileName, LineNumber

Solution

  • Given that you can't have consecutive NULL values, using LEAD/LAG should be suitable for this task. Without knowledge of your original data, we can work on your query and add on top two subqueries, last of which is optional:

    • the inner adds the information needed to the record successive to "Column2=NULL" records
    • the outer removes records having those null values
    SELECT * 
      FROM (SELECT FileName, LineNumber, LineData, Column2, 
                   CASE WHEN LAG(Column2) OVER(ORDER BY LineNumber) IS NULL
                        THEN CONCAT_WS(', ',
                                    ModifiedLineData, 
                                    LAG(ModifiedLineData) OVER(ORDER BY LineNumber))
                        ELSE ModifiedLineData
                   END AS ModifiedLineData
            FROM <your query>) cte
    WHERE Column2 IS NOT NULL 
    

    Output:

    FileName LineNumber LineData Column2 ModifiedLineData
    file1 4 1232,,"product-1", 1,0 1232,NA,"product-1", 1,0
    file3 8 1235,,"product-2", 1,0 1235,NA,"product-2", 1,0"failed"

    Check the demo here.