I'm using a BULK INSERT to load delimited .txt files into a staging table with 5 columns. The .txt files can sometimes contain errors and have more/less than 5 fields per line. If this happens, is it possible to detect it and cancel the entire BULK INSERT?
Each table column is of type VARCHAR. This was done because header (H01) and line (L0101, L0102, etc...) rows contain fields with different types. Because of this, setting MAXERRORS = 0 doesn't seem to be working as there are technically no syntax errors. As a result the transaction is committed, the catch block never activates and the rollback doesn't occur. Lines still get inserted into the table incorrectly shifted or bunched.
Expected .txt file format:
H01|Order|Date|Name|Address
L0101|Order|Part|SKU|Qty
L0102|Order|Part||Qty <-- Fields can be blank
L0103|Order|Part|SKU|Qty
Incorrect .txt file example:
H01|Order|Date|Name|Address
L0101|Order||Part|SKU|Qty <-- Extra field in the middle
||L0102|Order|Part|SKU|Qty <-- Extra fields at the beginning
L0103|Order|Part|SKU|Qty|| <-- Extra fields at the end
Code:
CREATE TABLE #TempStage (
Column1 VARCHAR(255) NULL
,Column2 VARCHAR(255) NULL
,Column3 VARCHAR(255) NULL
,Column4 VARCHAR(255) NULL
,Column5 VARCHAR(255) NULL
)
DECLARE
@dir SYSNAME
,@fname SYSNAME
,@SQL_BULK VARCHAR(255)
SELECT
@dir = '\\sharedfolder\'
,@fname = 'testOrder.txt'
SET @SQL_BULK =
'BULK INSERT #TempStage FROM ''' + @dir + @fname + ''' WITH
(
FIRSTROW = 1,
DATAFILETYPE=''char'',
FIELDTERMINATOR = ''|'',
ROWTERMINATOR = ''0x0a'',
KEEPNULLS,
MAXERRORS = 0
)'
BEGIN TRY
BEGIN TRANSACTION
EXEC (@SQL_BULK)
COMMIT TRANSACTION
END TRY
BEGIN CATCH
ROLLBACK TRANSACTION
END CATCH
SELECT * FROM #TempStage
DROP TABLE #TempStage
Expected output:
Column1 | Column2 | Column3 | Column4 | Column5 |
---|---|---|---|---|
H01 | Order | Date | Name | Address |
L0101 | Order | Part | SKU | Qty |
L0102 | Order | Part | NULL | Qty |
L0103 | Order | Part | SKU | Qty |
Incorrect output, would like to cancel so this doesn't happen (\ = pipe):
Column1 | Column2 | Column3 | Column4 | Column5 |
---|---|---|---|---|
H01 | Order | Date | Name | Address |
L0101 | Order | NULL | Part | SKU \ Qty |
NULL | NULL | L0102 | Order | Part \ SKU \ QTY |
L0103 | Order | Part | SKU | Qty\ | |
SQL Server 2016, 13.0.1742.0
I've created a basic way to do what I needed:
After loading the staging table, check for any instance of '|' (or whatever delimiter you're using), and raise an error if found.
IF EXISTS(SELECT * FROM #TempStage WHERE
Column1 LIKE '%|%'
OR Column2 LIKE '%|%'
OR Column3 LIKE '%|%'
OR Column4 LIKE '%|%'
OR Column5 LIKE '%|%'
)
RAISERROR('Incorrect file formatting; Pipe character found.', 16, 1);