Search code examples
sqlpostgresqldatabase-designmany-to-many

What is the best way to implement a many-to-many relationship in PostgreSQL?


I'm trying to implement a many-to-many relationship in PostgreSQL. Here are the tables from my database:

CREATE TABLE products (
   product_id   serial PRIMARY KEY
 , product_name varchar NOT NULL
);

CREATE TABLE orders (
   order_id   serial PRIMARY KEY
 , order_name varchar NOT NULL
);

CREATE TABLE product_order (
   product_id  int REFERENCES products
 , order_id    int REFERENCES orders
 , PRIMARY KEY (product_id, order_id)
);

There will not be any UPDATEs or DELETEs in the products and orders tables, so there is no need for ON DELETE and ON UPDATE statements.

I have also created two hash indexes so I can search for orders and products names and get their id stored in the table:

CREATE INDEX product_index ON products USING hash(product_name);
CREATE INDEX order_index ON orders USING hash(order_name);

Here's what I'm trying to do:

  1. I want to insert into the product table and return the id of the inserted row.
  2. I want to insert into the order table and return the id of the inserted row.
  3. I want to insert both the product_id and order_id into the product_order table.

There is an edge case:

If the product that I want to insert is already in the product table, then I don't want to create another row with a different id. In this case, I want to retrieve the product_id that is already in the table.

This edge case is the same for order.

To accomplish all these, I've created an SQL FUNCTION:

CREATE OR REPLACE FUNCTION add_product_order(myproduct varchar, myorder varchar)
RETURNS VOID
LANGUAGE sql
AS
$$
    WITH pro AS (
        WITH p as (
            SELECT product_id FROM products WHERE product_name = myproduct -- check if product is already in the table
        )
        INSERT into products (product_name) -- insert into products and get the product_id only if myproduct was not found
            SELECT (myproduct)
            WHERE NOT EXISTS (
                SELECT product_id FROM p
            )
        RETURNING product_id
    ),

    ord AS (
        WITH o as(
            SELECT order_id FROM orders WHERE order_name = myorder -- check if order is already in the table
        )
        INSERT into orders (order_name) -- insert into orders and get the order_id only if myorder was not found
            SELECT (myorder)
            WHERE NOT EXISTS (
                SELECT order_id FROM o
            )
        RETURNING order_id
    )
    INSERT INTO product_order (product_id, order_id) -- insert both FK ids into the product_order table
        SELECT pro.product_id, ord.order_id FROM pro, ord;
$$;

After creating the function, I execute the following SQL query to run it:

select add_product_order('product1','order1');

Everything seems to be fine, but it only works when the product I'm trying to insert is not in the table.

If the product is already in the table, the first SELECT returns the product_id in the temporary p table. But I don't know how to get a hold of p.product_id in the last INSERT INTO product_order.

Seeing I can't get too far with this, I also tried with a plpgsql FUNCTION:

CREATE OR REPLACE FUNCTION add_product_order(myproduct varchar, myorder varchar)
RETURNS VOID
LANGUAGE plpgsql
AS
$$
DECLARE
   id_product integer;
   id_order integer;
BEGIN
    SELECT product_id INTO id_product FROM products WHERE product_name = myproduct; -- check if product is already in the table

    IF NOT FOUND THEN
        RAISE INFO 'product % not found', myproduct;

        INSERT INTO products (product_name) VALUES (myproduct) RETURNING product_id; -- product not found, so insert it and get the id
        id_product := product_id; -- Tried also with SELECT product_id INTO id_product;
    END IF;

    SELECT order_id INTO id_order FROM orders WHERE order_name = myorder; -- check if order is already in the table

    IF NOT FOUND THEN
        RAISE INFO 'order % not found', myorder;

        INSERT INTO orders (order_name) VALUES (myorder) RETURNING order_id; -- order not found, so insert it and get the id
        id_order := order_id;
    END IF;

    INSERT INTO product_order (product_id, order_id) VALUES (id_product, id_order); -- insert both ids into the product_order table
END;
$$;

This plpgsql FUNCTION should solve the problem I had in the SQL function above.

But it gives me an error: query has no destination for result data

What is the correct way to accomplish this?

PS. I searched and read multiple answers before I posted this question:


Solution

  • You can get what you want just inserting the Product Name and the Order name letting Postgres handle duplication. You elevate the indexes on them to unique constraints (no hashing needed) then let the ON CONFLICT clause handle duplication. That is what select, if not found insert logic is attempting anyway. The only issue becomes Postgres not returning the ID on duplicates. To overcome that the insert for Product_Orders retrieves the ids from the appropriate names. This can be done in a single statement with a couple CTEs but then the returning clause is necessary and whatever is returned is null where the name already exists. So it just gets ignored anyway. However a SQL function/procedure can have multiple statements, so (imho) 3 statements are clearer. (see example here)

    create or replace 
    procedure gen_product_order(
              myproduct varchar
            , myorder   varchar
            )
      language sql
    as $$
      insert into products(product_name) 
           values (myproduct) 
               on conflict (product_name) do nothing; 
     
      insert into orders (order_name)  
          values (myorder)
              on conflict (order_name) do nothing;      
     
        insert into product_orders (product_id, order_id) -- insert both fk ids into the product_order table
           select product_id, order_id 
             from (select product_id 
                     from products 
                    where product_name = myproduct
                  ) prd
                , (select order_id 
                     from orders
                    where order_name = myorder
                  ) ord  
            on conflict (product_id, order_id) do nothing; 
    $$; 
    

    Note: I've used a procedure rather that a function returning void but if you Postgres version does not support a function would work.