sql postgresql database-design unique-constraint relational-division

How to secure a table to avoid duplicate data


I can't work out how to secure my table against duplicate combinations of attributes_positions. The best way to show what I mean is the following image:

[image not reproduced]

The column id_combination holds the number of the combination. A combination consists of attributes_positions, so a combination is a sequence of attributes_positions.

Now I would like to protect the table against inserting exactly the same sequence of attributes_positions.

Of course, if an already inserted combination contains one more or one fewer attributes_positions than the combination being inserted, that is OK.

In the image I show the difference between a duplicate and a non-duplicate combination.

Is there some way I can do that? Maybe something like 'before update', but how would I implement it for this example? I'm not very good with advanced SQL. The database where I am trying to secure the table is PostgreSQL 9.4.

I would be grateful for any help.


Solution

    -- The data
    CREATE TABLE theset (
            set_id INTEGER NOT NULL PRIMARY KEY
            , set_name text UNIQUE
            );
    INSERT INTO theset(set_id, set_name) VALUES
    ( 1, 'one'), ( 2, 'two'), ( 3, 'three'), ( 4, 'four');
    
    CREATE TABLE theitem (
            item_id integer NOT NULL PRIMARY KEY
            , item_name text UNIQUE
            );
    INSERT INTO theitem(item_id, item_name) VALUES
    ( 1, 'one'), ( 2, 'two'), ( 3, 'three'), ( 4, 'four'), ( 5, 'five');
    
    CREATE TABLE set_item (
            set_id integer NOT NULL REFERENCES theset (set_id)
            , item_id integer NOT NULL REFERENCES theitem(item_id)
            , PRIMARY KEY (set_id,item_id)
            );
            -- an index with the columns swapped is advisable for junction tables
    CREATE UNIQUE INDEX ON set_item(item_id, set_id);
    
    INSERT INTO set_item(set_id,item_id) VALUES
    (1,1), (1,2), (1,3), (1,4),
    (2,1), (2,2), (2,3), -- (2,4),
    (3,1), (3,2), (3,3), (3,4), (3,5),
    (4,1), (4,2), (4,4);
    
    CREATE FUNCTION set_item_unique_set( ) RETURNS TRIGGER AS
    $func$
    DECLARE
            -- the set touched by this row: OLD for DELETE, NEW for INSERT/UPDATE
            -- (an UPDATE that changes set_id only checks the new set)
            this_set_id integer;
    BEGIN
    IF TG_OP = 'DELETE' THEN
            this_set_id := OLD.set_id;
    ELSE
            this_set_id := NEW.set_id;
    END IF;
    
    IF EXISTS ( -- other set
            SELECT * FROM theset oth
            WHERE oth.set_id <> this_set_id
            -- count (common) members in the two sets
            -- items not in common will have count=1
            AND NOT EXISTS (
                    SELECT item_id FROM set_item x1
                    WHERE (x1.set_id = this_set_id OR x1.set_id = oth.set_id )
                    GROUP BY item_id
                    HAVING COUNT(*) = 1
                    )
            ) THEN
            RAISE EXCEPTION 'Not unique set';
    END IF;
    
    -- the return value of an AFTER row trigger is ignored
    RETURN NULL;
    END;
    $func$ LANGUAGE plpgsql
            ;
    
    CREATE CONSTRAINT TRIGGER check_item_set_unique
            AFTER UPDATE OR INSERT OR DELETE
            -- BEFORE UPDATE OR INSERT
            ON set_item
            FOR EACH ROW
            EXECUTE PROCEDURE set_item_unique_set()
            ;
    
    -- Test it
    INSERT INTO set_item(set_id,item_id) VALUES(4,5); -- success
    INSERT INTO set_item(set_id,item_id) VALUES(2,4); -- failure
    DELETE FROM set_item WHERE set_id=1 AND item_id= 4; -- failure
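
The tests above change one row per statement, and the trigger checks after every statement. A set that is built up over several statements can therefore be rejected when an intermediate state happens to equal an existing set. One possible way around this, sketched here as an assumption rather than as part of the original answer, is to make the constraint trigger deferrable so that the check runs at COMMIT. The sketch reuses set_item_unique_set() from above; set 5 and its rows are hypothetical illustration data.

```sql
-- Assumption: this replaces the non-deferrable trigger created above.
DROP TRIGGER check_item_set_unique ON set_item;

CREATE CONSTRAINT TRIGGER check_item_set_unique
        AFTER INSERT OR UPDATE OR DELETE
        ON set_item
        DEFERRABLE INITIALLY DEFERRED
        FOR EACH ROW
        EXECUTE PROCEDURE set_item_unique_set();

-- Hypothetical set 5, built one row at a time inside a transaction.
BEGIN;
INSERT INTO theset(set_id, set_name) VALUES (5, 'five');
INSERT INTO set_item(set_id,item_id) VALUES (5,1);
INSERT INTO set_item(set_id,item_id) VALUES (5,2);
INSERT INTO set_item(set_id,item_id) VALUES (5,3); -- temporarily the same items as set 2
INSERT INTO set_item(set_id,item_id) VALUES (5,5);
COMMIT; -- the deferred checks run here, against the final contents of set 5
```

The trade-off is that a violation is reported only at COMMIT, not at the statement that caused it. `SET CONSTRAINTS check_item_set_unique IMMEDIATE` can be issued inside a transaction to force the check earlier.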
    

    Note: There should also be a trigger for the DELETE case.


    UPDATE: added handling of DELETE

    (the handling of deletes is not perfect; imagine the case where the last element from a set is removed)
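
A trigger only checks rows that are changed after it has been created, so duplicates that are already in set_item go unnoticed. The following query is a minimal sketch (not part of the original answer) for listing existing sets with identical members before installing the trigger. It assumes the table and column names used above, and it only sees sets that have at least one row in set_item.

```sql
-- Collapse every set into a sorted array of its item_ids,
-- then report arrays that occur for more than one set.
SELECT items
     , array_agg(set_id ORDER BY set_id) AS duplicate_sets
FROM (
        SELECT set_id
             , array_agg(item_id ORDER BY item_id) AS items
        FROM set_item
        GROUP BY set_id
     ) s
GROUP BY items
HAVING COUNT(*) > 1;
```

Each returned row names one group of sets that would violate the rule. Sorting inside array_agg matters because array equality is order-sensitive.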