Search code examples
sql-servert-sqldelimiterbulkinsert

Can you cancel a BULK INSERT of all VARCHARs when a line's field count is incorrect?


I'm using a BULK INSERT to load delimited .txt files into a staging table with 5 columns. The .txt files can sometimes contain errors and have more/less than 5 fields per line. If this happens, is it possible to detect it and cancel the entire BULK INSERT?

Each table column is of type VARCHAR. This was done because header (H01) and line (L0101, L0102, etc...) rows contain fields with different types. Because of this, setting MAXERRORS = 0 doesn't seem to be working as there are technically no syntax errors. As a result the transaction is committed, the catch block never activates and the rollback doesn't occur. Lines still get inserted into the table incorrectly shifted or bunched.

Expected .txt file format:

H01|Order|Date|Name|Address
L0101|Order|Part|SKU|Qty
L0102|Order|Part||Qty            <-- Fields can be blank
L0103|Order|Part|SKU|Qty

Incorrect .txt file example:

H01|Order|Date|Name|Address
L0101|Order||Part|SKU|Qty        <-- Extra field in the middle
||L0102|Order|Part|SKU|Qty       <-- Extra fields at the beginning
L0103|Order|Part|SKU|Qty||       <-- Extra fields at the end

Code:

CREATE TABLE #TempStage (

     Column1    VARCHAR(255) NULL
    ,Column2    VARCHAR(255) NULL
    ,Column3    VARCHAR(255) NULL
    ,Column4    VARCHAR(255) NULL
    ,Column5    VARCHAR(255) NULL
)

DECLARE 
     @dir           SYSNAME
    ,@fname         SYSNAME
    ,@SQL_BULK      VARCHAR(255)

SELECT 
     @dir           =   '\\sharedfolder\'
    ,@fname         =   'testOrder.txt'


SET @SQL_BULK = 
        'BULK INSERT #TempStage FROM ''' + @dir + @fname + ''' WITH
        (
            FIRSTROW = 1,
            DATAFILETYPE=''char'',
            FIELDTERMINATOR = ''|'',
            ROWTERMINATOR = ''0x0a'',
            KEEPNULLS,
            MAXERRORS = 0
        )'

BEGIN TRY

    BEGIN TRANSACTION

        EXEC (@SQL_BULK)

    COMMIT TRANSACTION

END TRY

BEGIN CATCH

    ROLLBACK TRANSACTION

END CATCH

SELECT * FROM #TempStage
DROP TABLE #TempStage

Expected output:

Column1 Column2 Column3 Column4 Column5
H01 Order Date Name Address
L0101 Order Part SKU Qty
L0102 Order Part NULL Qty
L0103 Order Part SKU Qty

Incorrect output, would like to cancel so this doesn't happen (\ = pipe):

Column1 Column2 Column3 Column4 Column5
H01 Order Date Name Address
L0101 Order NULL Part SKU \ Qty
NULL NULL L0102 Order Part \ SKU \ QTY
L0103 Order Part SKU Qty\ |

SQL Server 2016, 13.0.1742.0


Solution

  • I've created a basic way to do what I needed:

    After loading the staging table, check for any instance of '|' (or whatever delimiter you're using), and raise an error if found.

    IF EXISTS(SELECT * FROM #TempStage WHERE
            Column1 LIKE '%|%'
            OR Column2 LIKE '%|%'
            OR Column3 LIKE '%|%'
            OR Column4 LIKE '%|%'
            OR Column5 LIKE '%|%'
    )
    RAISERROR('Incorrect file formatting; Pipe character found.', 16, 1);