postgresql database-design null unique primary-key

Composite PRIMARY KEY enforces NOT NULL constraints on involved columns

This is one strange, unwanted behavior I encountered in Postgres: When I create a Postgres table with composite primary keys, it enforces NOT NULL constraint on each column of the composite combination.

For example,

CREATE TABLE distributors (m_id integer, x_id integer, PRIMARY KEY(m_id, x_id));

enforces NOT NULL constraint on columns m_id and x_id, which I don't want! MySQL doesn't do this. I think Oracle doesn't do it as well.

I understand that PRIMARY KEY enforces UNIQUE and NOT NULL automatically but that makes sense for single-column primary key. In a multi-column primary key table, the uniqueness is determined by the combination.

Is there any simple way of avoiding this behavior of Postgres? When I execute this:

CREATE TABLE distributors (m_id integer, x_id integer);

I do not get any NOT NULL constraints of course. But I would not have a primary key either.

Solution

If you need to allow NULL values, use a UNIQUE constraint (or index) instead of a PRIMARY KEY (and add a surrogate PK column - I suggest a serial or IDENTITY column in Postgres 10 or later).

Auto increment table column

A UNIQUE constraint allows columns to be NULL:

CREATE TABLE distributor (
  distributor_id GENERATED ALWAYS AS IDENTITY PRIMARY KEY
, m_id integer
, x_id integer
, UNIQUE(m_id, x_id)  -- !
-- , CONSTRAINT distributor_my_name_uni UNIQUE (m_id, x_id)  -- verbose form
);

The manual:

For the purpose of a unique constraint, null values are not considered equal, unless NULLS NOT DISTINCT is specified.

In your case, you could enter something like (1, NULL) for (m_id, x_id) any number of times without violating the constraint. Postgres does not consider two NULL values equal - as per definition in the SQL standard.

If you need to treat NULL values as equal (i.e. "not distinct") to disallow such "duplicates", I see ~~two~~ three (since Postgres 15) options:

0. `NULLS NOT DISTINCT`

This option was added with Postgres 15 and allows to treat NULL values as "not distinct", so two of them conflict in a unique constraint or index. This is the most convenient option, going forward. The manual:

That means even in the presence of a unique constraint it is possible to store duplicate rows that contain a null value in at least one of the constrained columns. This behavior can be changed by adding the clause NULLS NOT DISTINCT ...

Detailed instructions:

Create unique constraint with null columns

1. Two partial indexes

In addition to the UNIQUE constraint above:

CREATE UNIQUE INDEX dist_m_uni_idx ON distributor (m_id) WHERE x_id IS NULL;
CREATE UNIQUE INDEX dist_x_uni_idx ON distributor (x_id) WHERE m_id IS NULL;

But this gets out of hands quickly with more than two columns that can be NULL. See:

Create unique constraint with null columns

2. A multi-column `UNIQUE` index on expressions

Instead of the UNIQUE constraint. We need a free default value that is never present in involved columns, like -1. Add CHECK constraints to disallow it:

CREATE TABLE distributor (
   distributor serial PRIMARY KEY
 , m_id integer
 , x_id integer
 , CHECK (m_id <> -1)
 , CHECK (x_id <> -1)
);

CREATE UNIQUE INDEX distributor_uni_idx
ON distributor (COALESCE(m_id, -1), COALESCE(x_id, -1));

Composite PRIMARY KEY enforces NOT NULL constraints on involved columns

0. NULLS NOT DISTINCT

1. Two partial indexes

2. A multi-column UNIQUE index on expressions

0. `NULLS NOT DISTINCT`

2. A multi-column `UNIQUE` index on expressions