Search code examples
databasepostgresqlnormalization

Is a primary key required? (6NF implementation)


Is a primary key required in the 6NF relvar tables of a 5NF relation. Consider the following setup:

-- The 5NF table
CREATE TABLE party_address(
    address_id int NOT NULL,
    party_id int NOT NULL,
    -- some other columns
    PRIMARY KEY (address_id, party_id)
);

CREATE TABLE party_address_is_billing(
    address_id int NOT NULL,
    party_id int NOT NULL,
    value boolean NOT NULL,
    transaction_time tstzrange NOT NULL DEFAULT tstzrange(CURRENT_TIMESTAMP, NULL, '[)'),
    EXCLUDE USING GIST (address_id WITH =, party_id WITH =, transaction_time WITH &&)
);

Is it required to explicitly declare PRIMARY KEY on the party_address_is_billing table? Since the exclusion constraint specifies a unique identifier ((address_id, party_id, transaction_time)), it seems redundant to explicitly specify PRIMARY KEY (address_id, party_id, transaction_time). It would also create an additional and unnecessary index.

  • What are the consequences of not specifying PRIMARY KEY on a table?

Solution

  • The relational model requires that every relation have at least one key. It doesn't require you to use particular keywords.

    In your 5NF table, if you declare the key as not null unique (address_id, party_id), you won't alter the underlying dependencies at all. And that's what you have to be concerned about from the relational point of view--that alternative syntax doesn't screw up the underlying dependencies. (Which means the relational model isn't concerned with an implementation's "additional and unnecessary index".)

    So, in "party_address_is_billing", if the exclusion constraint behaves like a key constraint, I think you're fine as far as the relational model is concerned.

    What are the consequences of not specifying PRIMARY KEY on a table?

    1. If you replace a primary key constraint with another declarative constraint that's behaviorally identical, you might surprise maintenance programmers. See the article Principle of least astonishment.

    2. You might also make framework programmers work harder. Many frameworks require not just a PRIMARY KEY constraint, they also require that constraint be a surrogate integer. (This isn't really a relational issue.)

    3. Some software engineering (CASE) tools might choke on an exclusion constraint.

    4. The error message associated with a process that violates an exclusion constraint might be less clear than an error message about a violation of a primary key constraint. (This is related to "1", above.)

    I don't think there are any consequences as far as the relational model is concerned. (That is, as long as you replace a primary key constraint with a declarative constraint that's behaviorally identical.)