Search code examples
sqlazureazure-data-factoryserverlessazure-synapse

How to add auto incremented columns while creating External Tables using CETAS in Azure Synapse serverless SQL Pool?


In server-less SQL Pool, Identities are not supported, is there any better way to add an auto-incremented column while creating an external table using Select Statement.

    CREATE EXTERNAL TABLE temp.example_table
    WITH 
    (
        DATA_SOURCE = data_source_name,
        LOCATION = 'TempTables/exanple_table',
        FILE_FORMAT = parquet_file_format
    )
    AS 


    SELECT  name                                    AS user_name
         , code                                        AS user_code
      FROM schema.example_table
 

How can we add an auto-increment column along with the name and code column in external table?

I want something like -

id user_name user_code
1 Indrajeet SinghI
2 Himanshu RawatH
3 Akshay SharmaA

Solution

  • Try ROW_NUMBER:

    
        CREATE EXTERNAL TABLE temp.example_table
        WITH 
        (
            DATA_SOURCE = data_source_name,
            LOCATION = 'TempTables/exanple_table',
            FILE_FORMAT = parquet_file_format
        )
        AS 
    
    
        SELECT  ROW_NUMBER() OVER (ORDER BY code) as id
             , name                                    AS user_name
             , code                                        AS user_code
          FROM schema.example_table
    
    

    One warning. If you run the same tomorrow it won’t necessarily be the same id for Akshay. So this may not be appropriate except for a one-time load.

    If stability cross days is important you might try HASHBYTES('MD5', code) as id. On a small table that should be unique. But on a large table you may have hash collisions and it not be unique.