Search code examples
postgresqltriggersforeign-keysone-to-many

How to enforce a one-to-many relationship in PostgreSQL where there is no exact foreign key match between child and parent rows?


I'm having trouble modeling data that has a parent table with a start and end date in its primary key, and a child table with a timestamp in its primary key that must fall within the range of the parent table's start and end dates. In fact, this problem is nested, as that parent table is actually the child to another table - a "grandparent" table - which also has start and end dates in its primary key; the parent table's start and end dates must likewise fit within the range of the grandparent table's start and end dates.

For background, I work at a water treatment company. We treat water by deploying water treatment machines to various sites as part of treatment contracts. In more specific terms:

  • There are various sites that need their water treated.
  • The sites create contracts with us so that we can treat water. The contracts always have a known start date, but the contracts can be for either a specific period of time or indefinitely, so the end date can be known or unknown (so NULLable end dates)
  • A single water treatment machine is deployed to a site at a time in order to fulfill contract requirements. If a machine breaks down in the middle of a contract and it needs to be replaced, we replace it with another machine under the same contract.
  • While machines are treating water under a contract, we collect treatment data from them.

Thus, we have to keep track of sites, treatment_contracts, machine_deployments, machines, and treatment_datapoints. A site can have multiple treatment_contracts, a treatment_contract can have multiple machine_deployments and multiple treatment_datapoints, and a machine can have multiple machine_deployments.

So a simplified version of the data I'm trying to model is this:

CREATE TABLE public.site
(
    id integer NOT NULL,
    PRIMARY KEY (id)
);

CREATE TABLE public.treatment_contract
(
    site_id integer NOT NULL,
    start_date date NOT NULL,
    end_date date,
    PRIMARY KEY (site_id, start_date, end_date)
    CONSTRAINT fk_treatment_contract__site FOREIGN KEY (site_id)
        REFERENCES public.site (site_id) MATCH SIMPLE
);

CREATE TABLE public.machine_deployment
(
    site_id integer NOT NULL,
    machine_id integer NOT NULL,
    start_date date NOT NULL,
    end_date date,
    PRIMARY KEY (site_id, machine_id, start_date, end_date),
    CONSTRAINT fk_machine_deployment__machine FOREIGN KEY (machine_id)
        REFERENCES public.machine (id) MATCH SIMPLE,
    <some provision to require that machine_deployment.start_date and machine_deployment.end_date are between treatment_contract.start_date and treatment_contract.end_date, and that machine_deployment.site_id matches treatment_contract.site_id>
);

CREATE TABLE public.treatment_datapoint
(
    site_id integer NOT NULL,
    time_stamp timestamp NOT NULL,
    PRIMARY KEY (site_id, time_stamp),
    <some provision to require time_stamp is between treatment_contract.start_date and treatment_contract.end_date, and that treatment_datapoint.site_id matches treatment_contract.site_id>
);

CREATE TABLE public.machine
(
    id integer NOT NULL,
    PRIMARY KEY (id)
);

I'm not sure how to proceed because PostgreSQL can only enforce foreign key relationships where there is an exact match between all foreign key fields - there is no provision in foreign key constraints that can enforce something like child.timestamp BETWEEN parent.start AND parent.end. treatment_datapoint should have a foreign key to treatment_contract, as a treatment_datapoint without a treatment_contract would make no sense, but there seems to be no way to enforce this foreign key relationship. Is the answer just to use triggers instead? I've always been told to avoid using triggers to define parent:child relationships, as that's what foreign keys are for.

Either way, though, there's got to be a way to model this, as I can't imagine that I'm the only one who's ever needed to enforce that a date within a child table is within a range defined in the parent table.


Solution

  • In short: to enforce relationship where there is no foreign key - make one.

    For your model to work you have to have a foreign key to treatment_contract and since the primary key of treatment_contract contains fields site_id, start_date, end_date you have to add the contract_start_date and contract_end_date to the tables you need to reference the contract, namely machine_deployment and treatment_datapoint.

    To make your life easier I'd advice against using NULL for a not yet known end date of a contract and machine deployment. I would consider it to be a "magic number" that means "infinity". This is not required but makes checks simpler.

    Also I'd add a check constraint to ensure a contract ends after it starts.

    And lastly you can use a check constraint to validate deployment start and end and datapoint timestamp.

    In the example bellow I use daterange and range operators in my checks. This is for convenience. You can achieve the same result with comparison operators (<,<=...).

    My proposed variant of your schema is:

    CREATE TABLE public.site
    (
        id integer NOT NULL,
        PRIMARY KEY (id)
    );
    
    CREATE TABLE public.treatment_contract
    (
        site_id integer NOT NULL,
        start_date date NOT NULL,
        end_date date NOT NULL,
        PRIMARY KEY (site_id, start_date, end_date),
        CONSTRAINT fk_treatment_contract__site FOREIGN KEY (site_id)
            REFERENCES public.site (id) MATCH SIMPLE
    );
    
    CREATE TABLE public.machine
    (
        id integer NOT NULL,
        PRIMARY KEY (id)
    );
    
    CREATE TABLE public.machine_deployment
    (
        site_id integer NOT NULL,
        machine_id integer NOT NULL,
        contract_start_date date NOT NULL,
        contract_end_date date NOT NULL,
        start_date date NOT NULL,
        end_date date NOT NULL,
        PRIMARY KEY (site_id, machine_id, start_date, end_date),
        CONSTRAINT fk_machine_deployment__machine FOREIGN KEY (machine_id)
            REFERENCES public.machine (id) MATCH SIMPLE,
        CONSTRAINT fk_machine_deployment__treatment_contract FOREIGN KEY (site_id, contract_start_date, contract_end_date)
            REFERENCES public.treatment_contract(site_id, start_date, end_date),
        CONSTRAINT chk_machine_deploiment_period CHECK (start_date <= end_date),    
        CONSTRAINT chk_machine_deploiment_in_contract CHECK (pg_catalog.daterange(start_date, end_date,'[]') <@ pg_catalog.daterange(contract_start_date, contract_end_date, '[]'))
    );
    
    CREATE TABLE public.treatment_datapoint
    (
        site_id integer NOT NULL,
        contract_start_date date NOT NULL,
        contract_end_date date NOT NULL,
        time_stamp timestamp NOT NULL,
        PRIMARY KEY (site_id, time_stamp),
        CONSTRAINT fk_treatment_datapoint__treatment_contract FOREIGN KEY (site_id, contract_start_date, contract_end_date)
            REFERENCES public.treatment_contract(site_id, start_date, end_date),
        CONSTRAINT chk_datapoint_in_contract CHECK (time_stamp::date <@ pg_catalog.daterange(contract_start_date, contract_end_date, '[]'))
    );