Search code examples
sqlsql-serversql-insertgreatest-n-per-group

PARTITION BY multiple column while inserting data from another table


How can I skip unique constraint error in SQL Server?

This is my source table:

CREATE TABLE source 
(
    RollNo INTEGER,
    Nam VARCHAR(6),
    Gender VARCHAR(1),
    Score INTEGER
);
    
INSERT INTO source (RollNo, Nam, Gender, Score)
VALUES ('101', 'John', 'M', '85'),
       ('102', 'Tracy', 'F', '79'),
       ('103', 'Jake', 'M', '92'),
       ('104', 'Edgar', 'M', NULL),
       ('105', 'Monica', 'F', '25'),
       ('106', 'Monica', 'F', '50'),
       ('1070', 'Yash', 'M', '68'),
       ('107', 'Yash', 'M', '70'),
       ('108', 'SFS', 'M', '68'),
       ('18', 'SFS77', 'F', '65');

I want populate in the dest table from source table where name & Gender is a Unique key and SeqNo should increment automatically:

dest table description:

CREATE TABLE dest 
(
    SeqNo BIGINT IDENTITY(1000,1) PRIMARY KEY,
    RollNo INTEGER,
    Nam VARCHAR(6),
    Gender VARCHAR(1),
    Score INTEGER
);

Here's what I tried:

Attempt #1:

INSERT INTO dest (RollNo, Nam, Gender, Score) 
    SELECT 
        FIRST_VALUE(RollNo) OVER (PARTITION BY Nam, Gender ORDER BY Score DESC),
        FIRST_VALUE(Nam) OVER (PARTITION BY Nam, Gender ORDER BY Score DESC),
        FIRST_VALUE(Gender) OVER (PARTITION BY Nam, Gender ORDER BY Score DESC),
        FIRST_VALUE(Score) OVER (PARTITION BY Nam, Gender ORDER BY Score DESC)
    FROM 
        source 
    WHERE 
        Nam IS NOT NULL AND Gender IS NOT NULL ;

ERROR: Violation of UNIQUE KEY constraint

Attempt #2:

INSERT INTO dest (RollNo, Nam, Gender, Score) 
    SELECT MAX(RollNo),Nam, Gender, MAX(Score) 
    FROM source
    GROUP BY Nam, Gender
    ORDER BY MAX(Score) DESC;

Output:

| SeqNo | RollNo |    Nam | Gender |  Score |
|-------|--------|--------|--------|--------|
|  1000 |    103 |   Jake |      M |     92 |
|  1001 |    101 |   John |      M |     85 |
|  1002 |    102 |  Tracy |      F |     79 |
|  1003 |   1070 |   Yash |      M |     70 |
|  1004 |    108 |    SFS |      M |     68 |
|  1005 |     18 |  SFS77 |      F |     65 |
|  1006 |    106 | Monica |      F |     50 |
|  1007 |    104 |  Edgar |      M | (null) |

If you see the row of yash it is taking max of RollNo. and max of Score which is wrong, I want it to take first value but I don't know how to do it.

Is there any other way to solve exclude this above two methods?


Solution

  • You can identify the row that has the highest score for each Nam/Gender tuple with ROW_NUMBER(), and use that information to filter the source data:

    INSERT INTO dest (RollNo, Nam, Gender, Score) 
    SELECT RollNo, Nam, Gender, Score
    FROM (
        SELECT s.*,
            ROW_NUMBER() OVER(PARTITION BY Nam, Gender ORDER BY Score DESC) rn
        FROM source s
    ) s
    WHERE rn = 1
    

    Side note: I would recommend putting a unique constraint on Nam/Gender tuples in the target table, so potential duplicates are always rejected at insert time:

    CREATE TABLE dest (
      SeqNo  BIGINT IDENTITY(1000,1) PRIMARY KEY,
      RollNo INTEGER,
      Name   VARCHAR(6),
      Gender VARCHAR(1),
      Score  INTEGER,
      UNIQUE (Name, Gender)
    );
    

    Side note #2: don't put single quotes around column names; they stand for literal strings in standard SQL.