Search code examples
postgresqldatabase-designnulluniqueprimary-key

Composite PRIMARY KEY enforces NOT NULL constraints on involved columns


This is one strange, unwanted behavior I encountered in Postgres: When I create a Postgres table with composite primary keys, it enforces NOT NULL constraint on each column of the composite combination.

For example,

CREATE TABLE distributors (m_id integer, x_id integer, PRIMARY KEY(m_id, x_id));

enforces NOT NULL constraint on columns m_id and x_id, which I don't want! MySQL doesn't do this. I think Oracle doesn't do it as well.

I understand that PRIMARY KEY enforces UNIQUE and NOT NULL automatically but that makes sense for single-column primary key. In a multi-column primary key table, the uniqueness is determined by the combination.

Is there any simple way of avoiding this behavior of Postgres? When I execute this:

CREATE TABLE distributors (m_id integer, x_id integer);

I do not get any NOT NULL constraints of course. But I would not have a primary key either.


Solution

  • If you need to allow NULL values, use a UNIQUE constraint (or index) instead of a PRIMARY KEY (and add a surrogate PK column - I suggest a serial or IDENTITY column in Postgres 10 or later).

    A UNIQUE constraint allows columns to be NULL:

    CREATE TABLE distributor (
      distributor_id GENERATED ALWAYS AS IDENTITY PRIMARY KEY
    , m_id integer
    , x_id integer
    , UNIQUE(m_id, x_id)  -- !
    -- , CONSTRAINT distributor_my_name_uni UNIQUE (m_id, x_id)  -- verbose form
    );
    

    The manual:

    For the purpose of a unique constraint, null values are not considered equal, unless NULLS NOT DISTINCT is specified.

    In your case, you could enter something like (1, NULL) for (m_id, x_id) any number of times without violating the constraint. Postgres does not consider two NULL values equal - as per definition in the SQL standard.

    If you need to treat NULL values as equal (i.e. "not distinct") to disallow such "duplicates", I see two three (since Postgres 15) options:

    0. NULLS NOT DISTINCT

    This option was added with Postgres 15 and allows to treat NULL values as "not distinct", so two of them conflict in a unique constraint or index. This is the most convenient option, going forward. The manual:

    That means even in the presence of a unique constraint it is possible to store duplicate rows that contain a null value in at least one of the constrained columns. This behavior can be changed by adding the clause NULLS NOT DISTINCT ...

    Detailed instructions:

    1. Two partial indexes

    In addition to the UNIQUE constraint above:

    CREATE UNIQUE INDEX dist_m_uni_idx ON distributor (m_id) WHERE x_id IS NULL;
    CREATE UNIQUE INDEX dist_x_uni_idx ON distributor (x_id) WHERE m_id IS NULL;
    

    But this gets out of hands quickly with more than two columns that can be NULL. See:

    2. A multi-column UNIQUE index on expressions

    Instead of the UNIQUE constraint. We need a free default value that is never present in involved columns, like -1. Add CHECK constraints to disallow it:

    CREATE TABLE distributor (
       distributor serial PRIMARY KEY
     , m_id integer
     , x_id integer
     , CHECK (m_id <> -1)
     , CHECK (x_id <> -1)
    );
    CREATE UNIQUE INDEX distributor_uni_idx
    ON distributor (COALESCE(m_id, -1), COALESCE(x_id, -1));