I have a task that to implement a 'rollback' (not the usual rollback) function for a batch of entries from different tables. For example:
def rollback(cursor, entries):
# entries is a dict of such form:
# {'table_name1': [id1, id2, ...], 'table_name2': [id1, id2, ...], ...}
I need to delete entries in each table_name. But because these entries may have relationship between so a bit complex. My idea is in several steps:
My questions are:
Any idea and suggestion would be appreciate.
(1) and (2) are not right. It's quite likely that there will be columns defined NOT NULL REFERENCES othertable(othercol)
- there are in any normal schema.
What I think you need to do is to sort the foreign key dependency graph to find an ordering that allows you to DELETE
, table-by-table, the data you need to remove. Be aware that circular dependencies are possible due to deferred foreign key constraints, so you need to demote/ignore DEFERRABLE INITIALLY DEFERRED
constraints; you can temporarily violate those so long as it's all consistent again at COMMIT
time.
Even then you might run into issues. What if a client used SET CONSTRAINTS
to make a DEFERRABLE INITIALLY IMMEDIATE
constraint DEFERRED
during a transaction? You'd then fail to cope with the circular dependency. To handle this your code must [SET CONSTRAINTS ALL DEFERRED
] before proceeding.
You will need to look at the information_schema
or the PostgreSQL-specific system catalogs to work out the dependencies. It might be worth a look at the pg_dump
source code too, since it tries to order dumped tables to avoid dependency conflicts. You'll be particularly interested in the pg_constraint
catalog, or its information_schema
equivalents information_schema.referential_constraints
, information_schema.constraint_table_usage
and information_schema.constraint_column_usage
.
You can use the either the information_schema
or pg_catalog
. Don't use both. information_schema
is SQL-standard and more portable, but can be slow to query and doesn't have all the information pg_catalog
contains. On the flip side, pg_catalog
's schema isn't guaranteed to remain compatible across major versions (like 9.1 to 9.2) - though it generally does - and its use isn't portable.