Search code examples
azureu-sql

Insert new rows, continue existing rowset row_number count


I'm attempting to perform some sort of upsert operation in U-SQL where I pull data every day from a file, and compare it with yesterdays data which is stored in a table in Data Lake Storage.

I have created an ID column in the table in DL using row_number(), and it is this "counter" I wish to continue when appending new rows to the old dataset. E.g. Last inserted row in DL table could look like this:

ID | Column1    | Column2
---+------------+---------
10 | SomeValue  |    1

I want the next rows to have the following ascending ids

11 | SomeValue  |   1 
12 | SomeValue  |   1 

How would I go about making sure that the next X rows continues the ID count incrementally such that the next rows each increases the ID column by 1 more than the last?


Solution

  • You could use ROW_NUMBER then add it to the the max value from the original table (ie using CROSS JOIN and MAX). A simple demo of the technique:

    DECLARE @outputFile string = @"\output\output.csv";
    
    @originalInput =
        SELECT *
        FROM ( VALUES
            ( 10, "SomeValue 1", 1 )
            ) AS x ( id, column1, column2 );
    
    
    @newInput =
        SELECT *
        FROM ( VALUES
            ( "SomeValue 2", 2 ),
            ( "SomeValue 3", 3 )
            ) AS x ( column1, column2 );
    
    
    @output =
        SELECT id, column1, column2
        FROM @originalInput
    
        UNION ALL
    
        SELECT (int)(x.id + ROW_NUMBER() OVER()) AS id, column1, column2
        FROM @newInput
                CROSS JOIN ( SELECT MAX(id) AS id FROM @originalInput ) AS x;
    
    
    OUTPUT @output
    TO @outputFile
    USING Outputters.Csv(outputHeader:true);
    

    My results:

    Results

    You will have to be careful if the original table is empty and add some additional conditions / null checks but I'll leave that up to you.