Search code examples
postgresqlindexingdatabase-performancedatabase-replication

Performance Tuning: Create index for boolean column


I have written a daemon processor which will fetch rows from one database and insert them into another for synchronizing. It will fetch rows based on a boolean indication flag sync_done.

My table has hundreds of thousands of rows. When I select all rows with sync_done is false, will it cause any database performance issues? Should I apply indexing for that sync_done column to improve performance, since only rows with a sync_done value of false are fetched?

Say, I have 10000 rows. Of those, 9500 have already been synchronized (sync_done is true) and will not be selected.

Please suggest how I might proceed.


Solution

  • For a query like this, a partial index covering only unsynced rows would serve best.

    CREATE INDEX ON tbl (id) WHERE sync_done = FALSE;
    

    However, for a use case like this, other synchronization methods may be preferable to begin with: