I have a project in postgresql to create a DWH from multiple schemas. The schemas have the same structure(tables, constrains). I am stuck on this problem : I have a table "employees" in all the schemas that i will use in DWH. The first entry in schema1 table employees is 1 John Dow, the first entry in schema2 table employees is 1 Mary Jane (the id column is auto increment). Is there a way to create a DWH with this structure ? Is there a way to partition the schemas data ? Thank you.
The easiest way will probably be to assign a number of the most significant bits of the key to become a source ID, thereby making your keys unique across all of your source databases. E.g. Let's assume you will never have more than 100 databases that you're warehousing, so a 7 bit database identifier will be sufficient.
INTEGER
(32 bit) keys would leave 24 bits remaining allowing up to 16777215 insertions, and BIGINT
(64 bit) keys will leave 56 bits remaining, allowing up to 72057594037927935 insertions. (Beware that ALTER TABLE ... ALTER COLUMN SET DATA TYPE
requires a full table rewrite and can be very slow for high volume tables.)DB_ID << 24
(for 32 bit keys) or DB_ID << 56
(for 64 bit keys) to it. E.g. For a DB with ID 13 and 32 bit keys, you would add 218103808 to every key value.CHECK CONSTRAINT
s to each table, ensuring their key is in the range [(DB_ID << 24) + 1, ((DB_ID+1) << 24) - 1]
.ALTER SEQUENCE :seq_name MINVALUE :min_val MAXVALUE :max_val START WITH :start_val RESTART WITH :start_val
, where min_val = (DB_ID << 24) + 1
, max_val = ((DB_ID+1) << 24) - 1
, and start_val = currval(SEQUENCE_NAME)
.