Search code examples
postgresqlbusiness-intelligencedatabase-replication

Create a customized slave of postgresql


I need to create a slave for BI purposes and I need to modify some tables (e.g., remove all passwords or sensitive data). My database is PostgreSQL. I wonder if I can do it in database layer or I should do it programmatically by writing a code to do the replication.


Solution

  • You could use logical replication and have replica enabled triggers (that fire ony on replication) that modify the data when they are applied:

    ALTER TABLE mytab DISABLE TRIGGER mytrig;
    ALTER TABLE mytab ENABLE REPLICA TRIGGER mytrig;
    

    You have to make sure that no replication conflicts can arise from these modifications. For example, never modify a key column.

    Replication conflicts would stop replication and break your system.

    The traditional way to solve this problem is to use an ETL process. That way you can have a different data model on the target database and for example pre-aggregate data so that the data warehouse doesn't grow too big and has a data model optimized for analytical queries.