Search code examples
postgresqlcreate-table

postgresql : create table as union select with a serial key


I am trying to create a dimension table dim_airport from a raw_data table. The raw table has 2 airportcode columns (origin and destination). I want to get an exhaustive distinct set of all airport codes and create a table with that. In addition, I also need to add a serial key to perform auto increment. I can do this row_number but wanted to find another way to do it.

create table DIM_airport (airportkey serial primary key ) 
as 
select distinct originairportcode as airportcode, 
               origairportname as airportname,
               origincityname as city from raw_data
union
select distinct destairportcode, destairportname, destcityname 
from raw_data;

If i defined a row_number window function over airportcode it works. I am looking for a solution where it can auto increment directly without explicitly defining the value from row_number()


Solution

  • The syntax of "Create table ... As ..." does not allow defining additional columns, it defines only columns from the select statement. Of course you can persist by selecting a constant as a place holder and establish the column name. Then create a sequence, update the table to set value for the place holder and then a few alter tables to complete the desired definition.
    A much easier way would just be a 2 step process:

    1. Create the table.
    2. Populate with simple select.
    create table DIM_airport( airportkey  integer generated always as identity 
                            , airportcode text 
                            , airportname text
                            , city        text
                            , constraint  DIM_airport_pk
                                          primary key (airportkey)
                            ) ;
    
    insert into  DIM_airport(airportcode, airportname, city)
          select originairportcode  
               , origairportname 
               , origincityname
          union 
          select destairportcode  
               , destairportname  
               , destcityname ;  
    

    You do not need DISTINCT on either select as the UNION itself eliminates duplicates. See examples here.