Search code examples
sqlsql-serverbulkinserterror-checking

How/where to do data error checks when using a SQL Server stored procedures to load bulk insert into temp table and then load to table?


I need to load data into the final SQL Server database table after the values are checked for errors. It's the error checks for the newly loaded data that I can't figure out.

I have a stored procedure (see usp_LoadSchoolCOA) to bulk insert a .CSV file into a temp table, then into the final table which works if no triggers are programmed.

However, I need to check the data from the inserted file for errors and display all errors found and NOT insert data into the final table, OR if no errors, print no errors found before inserting into the final table (SchoolCOA) - this is what I can't figure out.

I learned how to write error checks inside an insert stored procedure using if/else if and try/catch, but that was only for inserting data 1 line at a time. Due to the bulk insert, I tried writing the error checks inside a trigger (see trg_iu_SchoolCOA). However, when I run the usp_LoadSchoolCOA stored procedure, I always a message

The transaction ended in the trigger. The batch has been aborted

While reading about others' similar problems, it seems like triggers shouldn't be used for running error checks. So I thought maybe I could add them inside the usp_LoadSchoolCOA stored procedure by using a while loop to check for data value errors prior to merging the data into the final table - but after reading more about that, it seems while loops are just a waste of CPU and other resources.

So what is the best method for the following:

  • run a bulk insert
  • run error checks on values
    • if # of errors > 0,
      • then print all errors for the whole file (or temp table) and
      • not insert/merge any values into final table;
    • OR if # of errors = 0,
      • print statement that data checks were successfully completed,
      • then insert/merge values into final table and print total number of rows loaded ?

I have 3 tables data tables that require this same process. I'm using the shortest one as an example.

Here is the code for the tables and the bulk insert stored procedure for SchoolCOA table using temp table:

-- Create TempSchoolCOA & SchoolCOA Data Files 
CREATE TABLE dbo.TempSchoolCOA
( 
    NewOPEID char(8),
    PriorOPEID char(8),
    CodeCOA int,
    DateCOA char(8)
);

CREATE TABLE dbo.SchoolCOA 
(
    ChangeID int IDENTITY (1,1) NOT NULL PRIMARY KEY,
    NewOPEID char(8) NOT NULL CHECK (LEN(NewOPEID) = 8),
    PriorOPEID char(8) NOT NULL CHECK (LEN(PriorOPEID) = 8),
    CodeCOA int NOT NULL,
    DateCOA char(8) NOT NULL
);

-- Stored procedure to bulk insert data file into SchoolCOA
CREATE PROCEDURE usp_LoadSchoolCOA
    @FullFilePath NVARCHAR(MAX) 
AS
BEGIN
    DECLARE @sql NVARCHAR(MAX)
    TRUNCATE TABLE dbo.TempSchoolCOA

    SET @sql = N'BULK INSERT dbo.TempSchoolCOA FROM ''' + @FullFilePath + ''' WITH (FORMAT=''CSV'', CHECK_CONSTRAINTS, FIRE_TRIGGERS, FIELDTERMINATOR='','', ROWTERMINATOR=''\n'', FIRSTROW=2);'
    SELECT  @sql
    EXEC sp_executesql @sql

    MERGE INTO dbo.SchoolCOA AS TGT
        USING
            (SELECT NewOPEID, PriorOPEID, CodeCOA, DateCOA FROM dbo.TempSchoolCOA) 
            AS SRC ON (TGT.NewOPEID = SRC.NewOPEID AND TGT.PriorOPEID = SRC.PriorOPEID)
            WHEN MATCHED THEN 
                UPDATE SET  
                    TGT.NewOPEID = SRC.NewOPEID,
                    TGT.PriorOPEID = SRC.PriorOPEID,
                    TGT.CodeCOA = SRC.CodeCOA,
                    TGT.DateCOA = SRC.DateCOA
            WHEN NOT MATCHED THEN 
                INSERT (
                    NewOPEID,
                    PriorOPEID,
                    CodeCOA,
                    DateCOA
                    )
                 VALUES (
                    SRC.NewOPEID,
                    SRC.PriorOPEID,
                    SRC.CodeCOA,
                    SRC.DateCOA
                    );  
END;

-- To run, use: EXEC usp_LoadSchoolCOA @FullFilePath = 'C:\Users\Admin\Documents\TU Grad School\Case Study\SQL Work\TestSchoolCOA3.csv'

Here is the code for error catch stored procedure:

-- Error Checks for Table Inserts
CREATE PROCEDURE usp_GetErrorInfo
AS SELECT
    ERROR_NUMBER() AS ErrorNumber,
    ERROR_SEVERITY() AS ErrorSeverity,
    ERROR_STATE() AS ErrorState,
    ERROR_PROCEDURE() AS ErrorProcedure,
    ERROR_LINE() AS ErrorLine,
    ERROR_MESSAGE() AS ErrorMessage

    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
GO

Here is the code for error checks for SchoolCOA data inserts that I made into a trigger, but when the usp_LoadSchoolCOA procedure is run, it results in a message

The transaction ended in the trigger. The batch has been aborted.

If it shouldn't be a trigger, how can I get these error checks to run (ideally for the entire file/TempSchoolCOA table)?

If it should be a CHECK on the table column, how do I write that for checking NewOPEID and PriorOPEID against the OPEID column in the SchoolDetails table? I guess I could just allow it to be inserted and make it a warning, but I'm not sure how to do that.

/* Validate SchoolCOA inserts with Trigger */ 
CREATE TRIGGER trg_iu_SchoolCOA 
ON SchoolCOA 
FOR INSERT, UPDATE
AS
BEGIN
    BEGIN TRY
        DECLARE @NewOPEID char(8), @PriorOPEID char(8), @CodeCOA int, @DateCOA char(8)                         

        SELECT @NewOPEID = (SELECT NewOPEID FROM inserted), 
               @PriorOPEID = (SELECT PriorOPEID FROM inserted),
               @CodeCOA = (SELECT CodeCOA FROM inserted),
               @DateCOA = (SELECT DateCOA FROM inserted)
    
        /* VALIDATE NewOPEID */     
        IF @NewOPEID NOT LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
        BEGIN
            ROLLBACK TRANSACTION
            PRINT 'Error: NewOPEID invalid - must be 8 digits.'
            RETURN
        END
        ELSE IF NOT EXISTS (SELECT 1 FROM SchoolDetails WHERE OPEID = @NewOPEID)
        BEGIN
            ROLLBACK TRANSACTION
            PRINT 'Error: NewOPEID not found in SchoolDetails table: ' + @NewOPEID
            RETURN
        END

        /* VALIDATE PriorOPEID */   
        ELSE IF @PriorOPEID NOT LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
        BEGIN
            ROLLBACK TRANSACTION
            PRINT 'Error: PriorOPEID invalid - must be 8 digits.' 
            RETURN
        END
        ELSE IF NOT EXISTS (SELECT 1 FROM SchoolDetails WHERE OPEID = @PriorOPEID)
        BEGIN
            ROLLBACK TRANSACTION
            PRINT 'Error: PriorOPEID not found in SchoolDetails table: ' + @PriorOPEID
            RETURN
        END

        /* VALIDATE CodeCOA */
        ELSE IF NOT EXISTS (SELECT 1 FROM CodeCOARef WHERE CodeCOA = @CodeCOA)
        BEGIN
            ROLLBACK TRANSACTION
            PRINT 'Error: CodeCOA not found in CodeCOARef table.'
            RETURN
        END

        PRINT 'Inserted/updated data successfully passed error checks for SchoolCOA table.'
    END TRY

    BEGIN CATCH
        EXECUTE usp_GetErrorInfo;
    END CATCH; 
END

And for reference, here's the School Details table:

CREATE TABLE dbo.SchoolDetails 
(
    SchoolID int IDENTITY (1,1) NOT NULL PRIMARY KEY,
    OPEID char(8) NOT NULL CHECK (LEN(OPEID) = 8),
    SchoolName varchar(80) NOT NULL,
    LocName varchar(80) NOT NULL,
    AddrLine1 varchar(100),
    AddrLine2 varchar(100),
    City varchar(50) NOT NULL,
    State2 char(2) NOT NULL CHECK (LEN(State2) = 2),
    ZipCode char(5),MainOrLoc int NOT NULL,
    OpenStatus int NOT NULL,
    StartDate char(8),
    StartReason int,
    StopDate char(8),
    StopReason int
);

Solution

  • You have some fatal flaws in your trigger:

    • A trigger must finish with the same number of transactions open as at the start. This is the cause of the spurious error due to the rollback. Instead of rolling back, throw an error using THROW, this end the trigger and will roll back the transaction automatically.
    • The error handling procedure is a really bad idea. It swallows errors allowing the code to continue, and returns errors as spurious resultsets. Remove it completely, and allow errors to do what they are supposed to do: end the procedure and show a message to the client.
    • Triggers need to handle multiple (or zero) rows, you can't assign to single variables. Instead just do one big IF EXISTS.
    CREATE OR ALTER TRIGGER trg_iu_SchoolCOA 
    ON SchoolCOA 
    FOR INSERT, UPDATE
    
    IF EXISTS (SELECT 1
        FROM inserted i
        WHERE i.NewOPEID NOT LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
           OR NOT EXISTS (SELECT 1
            FROM SchoolDetails sd
            WHERE sd.OPEID = i.NewOPEID)
           OR i.PriorOPEID NOT LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
           OR NOT EXISTS (SELECT 1
            FROM SchoolDetails sd
            WHERE sd.OPEID = i.PriorOPEID)
           OR NOT EXISTS (SELECT 1
            FROM CodeCOARef ccr
            WHERE ccr.CodeCOA = i.CodeCOA)
      )
        THROW 50001, N'Invalid data', 1;
    

    Having said that, do not do any of this. Triggers are not the right way to enforce these kinds of constraints, which can be instead implemented using normal built-in constraints.

    Your table should be declared like this.

    create table dbo.SchoolCOA 
    (
        ChangeID int IDENTITY (1,1) NOT NULL PRIMARY KEY,
        NewOPEID char(8) NOT NULL
          CONSTRAINT NewOPEID_valid CHECK (NewOPEID LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]')
          CONSTRAINT NewOPEID_SchoolDetails FOREIGN KEY REFERENCES SchoolDetails (OPEID),
        PriorOPEID char(8) NOT NULL
           CONSTRAINT PriorOPEID_valid CHECK (PriorOPEID LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]')
           CONSTRAINT PriorOPEID_SchoolDetails FOREIGN KEY REFERENCES SchoolDetails (OPEID),
        CodeCOA int NOT NULL
           CONSTRAINT CodeCOA_ FOREIGN KEY REFERENCES CodeCOA (CodeCOARef),
        DateCOA date NOT NULL    -- dates should be in a date column, not string
    );
    

    Ensure CHECK_CONSTRAINTS is set in the BULK INSERT statement, or these will not be enforced.