Search code examples
sqlsql-serverdatabasebulkinsertmap-files

Bulk Insert with format file NOT skipping column in destination table with 146 fields as it should be


Here is the full error:

Msg 4864, Level 16, State 1, Line 3 Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 5 (FK_User_CreatedBy).

And here is the existential snapshot of my pain :)

enter image description here

Many questions touch on these issues, but none of them do the trick...

I suspect my problem is something like described here, but I am not sure. The destination table column that is not being skipped properly is NOT sparse.

enter image description here

Here is the two row data file for import (.csv) open in notepad and notepad++: (Yes I am aware that the row terminator is \r\n and the field/column terminator is \t or ',')

enter image description here

Here it is in plain text:

1,fArty,Padul,1,10/1/1962,Head of ,Australia,AU Talavera Centre,NSW,7 CSU,[email protected]
2,mifsm,Jodel,1,10/1/1970,Chief Officer,Australia,AU ,NSW,8 CSU,[email protected]

CONTEXT/BACKGROUND: Test on Small Table and Input File with Few Records (remember it is column skipping on a table with many columns that ends up hurting)...

The import WORKS perfectly for a small database table that looks like this:

enter image description here

And is created thus:

enter image description here

Here is the code for the table create:

    DROP TABLE dbo.tbl_Person_Importtest

CREATE TABLE dbo.tbl_Person_Importtest 
(
ID int PRIMARY KEY NOT NULL,
LastName varchar(100) NOT NULL, 
FirstName varchar(100) NOT NULL, 
FK_Gender varchar(4) NOT NULL, 
DateOfBirth date NOT NULL, 
JobTitle varchar(200) NOT NULL, 
Address1Country varchar(50) NOT NULL, 
Location varchar(200) NOT NULL,
Address1StateOrProvince varchar(50) NOT NULL, 
Department varchar(200) NOT NULL, 
EMailAddress1  varchar(200) NOT NULL
)  

The .xml bulk insert format file looks like this:

enter image description here

Note that it also works if I skip the ID (PK + index) column since the database table is empty and the import file does not have an index. This is working fine for the small destination table, as the database is generating the primary key index.

Here the format file as text ():

<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <RECORD>
  <FIELD ID="1" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="12"/>
  <FIELD ID="2" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="100" COLLATION="Latin1_General_CI_AS"/>
  <FIELD ID="3" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="100" COLLATION="Latin1_General_CI_AS"/>
  <FIELD ID="4" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="4" COLLATION="Latin1_General_CI_AS"/>
  <FIELD ID="5" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="11"/>
  <FIELD ID="6" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="200" COLLATION="Latin1_General_CI_AS"/>
  <FIELD ID="7" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="50" COLLATION="Latin1_General_CI_AS"/>
  <FIELD ID="8" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="200" COLLATION="Latin1_General_CI_AS"/>
  <FIELD ID="9" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="50" COLLATION="Latin1_General_CI_AS"/>
  <FIELD ID="10" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="200" COLLATION="Latin1_General_CI_AS"/>
  <FIELD ID="11" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="200" COLLATION="Latin1_General_CI_AS"/>
 </RECORD>
 <ROW>
  <COLUMN SOURCE="1" NAME="ID" xsi:type="SQLINT"/>
  <COLUMN SOURCE="2" NAME="LastName" xsi:type="SQLVARYCHAR"/>
  <COLUMN SOURCE="3" NAME="FirstName" xsi:type="SQLVARYCHAR"/>
  <COLUMN SOURCE="4" NAME="FK_Gender" xsi:type="SQLVARYCHAR"/>
  <COLUMN SOURCE="5" NAME="DateOfBirth" xsi:type="SQLDATE"/>
  <COLUMN SOURCE="6" NAME="JobTitle" xsi:type="SQLVARYCHAR"/>
  <COLUMN SOURCE="7" NAME="Address1Country" xsi:type="SQLVARYCHAR"/>
  <COLUMN SOURCE="8" NAME="Location" xsi:type="SQLVARYCHAR"/>
  <COLUMN SOURCE="9" NAME="Address1StateOrProvince" xsi:type="SQLVARYCHAR"/>
  <COLUMN SOURCE="10" NAME="Department" xsi:type="SQLVARYCHAR"/>
  <COLUMN SOURCE="11" NAME="EMailAddress1" xsi:type="SQLVARYCHAR"/>
 </ROW>
</BCPFORMAT>

And it was created using bcp at the command line like this:

enter image description here

Here is the bcp command line in text:

bcp YFP..tbl_Person_Importtest format nul -f PersonImportMapFile.xml -c -x -T

Now when I execute the import with all of these files against the empty small table, all is good:

enter image description here

If I insert more rows again, no problem... enter image description here enter image description here

THE LARGE TABLE I cannot include a full description due to intellectual property issues, but the large destination table has 146 fields with no sparse fields and plenty of DATETIME and DATE fields, as well as stacks of foreign keys (mostly INT) some of which are nullable. Here is the map file as generated by bcp (With Field names truncated and some removed):

   CREATE TABLE [dbo].[tbl_Person](
[ID] [int] IDENTITY(1,1) NOT NULL,
[RecordTitle] [nvarchar](250) NULL,
[SecurityCode] [nvarchar](250) NULL,
[DateCreated] [smalldatetime] NOT NULL,
[FK_User_CreatedBy] [int] NULL,
[wning] [int] NULL,
[ssigned] [int] NULL,
[ollowup] [int] NULL,
[sation_Owning] [int] NULL,
[wning] [int] NULL,
[pdate] [smalldatetime] NULL,
[astUpdate] [int] NULL,
[tatus] [bit] NULL,
[ive] [smalldatetime] NULL,
[eason] [nvarchar](250) NULL,
[tatus] [bit] NULL,
[ion] [smalldatetime] NULL,
[Titles] [int] NULL,
[LastName] [varchar](50) NOT NULL,
[FirstName] [varchar](50) NOT NULL,
[MiddleName] [nvarchar](50) NULL,
[DateOfBirth] [datetime] NULL,
[ion] [ntext] NULL,
[Code] [nvarchar](50) NULL,
[r] [int] NULL,
[] [nvarchar](100) NULL,
[nt] [nvarchar](100) NULL,
[ame] [nvarchar](100) NULL,
[hone] [nvarchar](50) NULL,
[tName] [nvarchar](100) NULL,
[e1] [nvarchar](50) NULL,
[e2] [nvarchar](50) NULL,
[one1] [nvarchar](50) NULL,
[Moe2] [nvarchar](50) NULL,
[Fax] [nvarchar](50) NULL,
[e1] [nvarchar](250) NULL,
[e2] [nvarchar](250) NULL,
[e3] [nvarchar](250) NULL,
[Address1CityOrSuburb] [nvarchar](50) NULL,
[Address1StateOrProvince] [nvarchar](50) NULL,
[Address1Country] [nvarchar](50) NULL,
[Address1PostalCode] [nvarchar](20) NULL,
[Line1] [nvarchar](250) NULL,
[Line2] [nvarchar](250) NULL,
[Line3] [nvarchar](250) NULL,
[CityOrSuburb] [nvarchar](50) NULL,
[StateOrProvince] [nvarchar](50) NULL,
[Country] [nvarchar](50) NULL,
[PostalCode] [nvarchar](20) NULL,
[RL] [nvarchar](200) NULL,
[ress1] [nvarchar](100) NULL,
[ress2] [nvarchar](100) NULL,
[ne] [bit] NULL,
[] [bit] NULL,
[il] [bit] NULL,
[tail] [bit] NULL,
[kEl] [bit] NULL,
[kPalMail] [bit] NULL,
[dMM] [bit] NULL,
[_Preferred] [int] NULL,
[] [int] NULL,
[onStatus] [int] NULL,
[1] [money] NULL,
[2] [money] NULL,
[3] [money] NULL,
[4] [money] NULL,
[5] [money] NULL,
[6] [money] NULL,
[ncome] [money] NULL,
[rInc1] [money] NULL,
[rInc2] [money] NULL,
[rInc3] [money] NULL,
[rInc4] [money] NULL,
[rInc5] [money] NULL,
[rInc6] [money] NULL,
[artner] [money] NULL,
[1] [money] NULL,
[2] [money] NULL,
[3] [money] NULL,
[4] [money] NULL,
[5] [money] NULL,
[6] [money] NULL,
[7] [money] NULL,
[8] [money] NULL,
[ud] [money] NULL,
[] [money] NULL,
[] [money] NULL,
[] [money] NULL,
[] [money] NULL,
[] [money] NULL,
[] [money] NULL,
[lAss] [money] NULL,
[1] [money] NULL,
[2] [money] NULL,
[3] [money] NULL,
[4] [money] NULL,
[5] [money] NULL,
[lDebt] [money] NULL,
[rganisation_Provider] [int] NULL,
[Insurance] [money] NULL,
[ver] [money] NULL,
[itd] [nvarchar](250) NULL,
[veod] [nvarchar](250) NULL,
[fiNominated] [bit] NULL,
[ [money] NULL,
[idD] [nvarchar](50) NULL,
[ccs] [int] NULL,
[mpus] [int] NULL,
[ry] [money] NULL,
[feInsurance] [bit] NULL,
[Cor] [bit] NULL,
[ov] [money] NULL,
[DCer] [bit] NULL,
[mous] [int] NULL,
[iftatus] [int] NULL,
[PCos] [int] NULL,
[PDCus] [int] NULL,
[ersned] [int] NULL,
[ueKey] [uniqueidentifier] NULL,
[rified] [bit] NULL,
[Actr] [smalldatetime] NULL,
[embpe] [int] NULL,
[etAult] [money] NULL,
[t7] [money] NULL,
[t8] [money] NULL,
[6] [money] NULL,
[7] [money] NULL,
[8] [money] NULL,
[onalScore] [nvarchar](10) NULL,
[] [int] NULL,
[rganment] [int] NULL,
[rac] [int] NULL,
[kerpdate] [datetime] NULL,
[keriew] [datetime] NULL,
[ari] [int] NULL,
[] [int] NULL,
[Q1] [int] NULL,
[Q2] [int] NULL,
[Q3] [int] NULL,
[Q4] [int] NULL,
[Q5] [int] NULL,
[Q6] [int] NULL,
[Q7] [int] NULL,
[Q8] [int] NULL,
[Q9] [int] NULL,
[Q10] [int] NULL,
 CONSTRAINT [PK_tbl_Person] PRIMARY KEY CLUSTERED 
(
    [ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

Outcome

I should be able to import into this table using the same data file as I specified for the example with the smaller table above, but I am getting the error specified at the beginning of this question.

The field it is picking on is indeed the fifth field/column in the table, but it is supposed to be skipping to the fields named in the map only, according to this MS tutorial.

IT just looks like I will need to use a staging table or other programmatic approach wit middleware or SQLBulkCopy (c# .NET), and I would prefer not to do this at this stage. I would just like the map file to work.

Did I miss something, or is it a case of shoot the BULK INSERT with-map-file for-large-table horse and get a different ride?


Solution

  • What perhaps you have missed is that the example in the tutorial which uses an XML format file to skip columns inserts data into a view that includes only the target columns; it doesn't appear to be possible to use an XML format file to skip columns in the target table.

    You could create a view of the relevant columns on tbl_person and insert to that.

    Alternatively you could use an old-style non XML format file, or (perhaps easier, if your environment security setting allow it) use OPENROWSET(BULK...) - both of these options are covered in the tutorial.

    There are a few other things you might consider changing:

    1 - The sample wide table definition doesn't match your input file in a couple of ways:

    • There is no source in your file for the NOT NULL column DateCreated - you might need a DEFAULT constraint to set the value - perhaps this is present but was omitted from the example table definition?
    • The table contains no FK_Gender,JobTitle,Location, Department, or EMailAddress1 columns, even though they are referenced format file - this might be a side-effect of your redaction of the column names.

    2 - You can generate a format file which more closely matches your data file using a command like the one below, which correctly sets the field terminator to a comma:

    bcp YFP..tbl_Person_Importtest format nul -f c:\temp\so.bcp.gen.test.fmt -c -x -T -t ,
    

    3 - The screenshot of your BULK INSERT command includes the command:

    SET IDENTITY INSERT <table> OFF
    

    before the bulk insert. There are two issues with this:

    • This doesn't do anything in the context of a BULK INSERT command, where you'd use the KEEPIDENTITY option.

    • Setting IDENTITY INSERT OFF disables the insertion of identity values (i.e. the normal behaviour). If you use the OPENROWSET(BULK...) method, you'll need to set IDENTITY INSERT ON before the command runs to enable identity insertion, then IDENTITY INSERT OFF after the command completes.