Search code examples
sql-servercsvbcpdata-import

BCP file import into SQL table using xml format file results in errors


BCP is something that always bugs me and after much trying I am putting up a simple test that results in a "Invalid Character Value for cast specification" error. Can't seem to pinpoint what the problem is in this.

To begin, I have a SQL SERVER table

CREATE TABLE [dbo].[customDataFromAPI](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [periodStartUTC] [datetimeoffset](7) NULL,
    [M1] [nvarchar](50) NULL,
    [M2] [nvarchar](50) NULL,
    [M3] [nvarchar](50) NULL,
    [M4] [nvarchar](30) NULL,
    [M5] [nvarchar](40) NULL,
    [M6] [nvarchar](30) NULL,
    [M7] [bigint] NULL,
    [M8] [bigint] NULL,
 CONSTRAINT [PK_customDataFromAPI] PRIMARY KEY CLUSTERED 
(
    [id] ASC
)

I also have the following CSV file as a sample,

PeriodStart,C1,C2,C3,C4,C5,C6,C7,C8
2014-04-01T04:00:00.000Z,Kuber TEST,Another String Text Field,Quebec,Internal Server Name222,test service,TestingNowss,123178594823,12312

I am using this XML format file for bcp import:

<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <RECORD>
  <FIELD ID="1" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="24"/>
  <FIELD ID="2" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="50"/>
  <FIELD ID="3" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="50"/>
  <FIELD ID="4" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="30"/>
  <FIELD ID="5" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="40"/>
  <FIELD ID="6" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="30"/>
  <FIELD ID="7" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="50"/>
  <FIELD ID="8" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="18"/>
  <FIELD ID="9" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="18"/>
 </RECORD>
 <ROW>
  <COLUMN SOURCE="1" NAME="periodStartUTC" xsi:type="SQLDATETIMEOFFSET"/>
  <COLUMN SOURCE="7" NAME="nbiApplicationType" xsi:type="SQLNVARCHAR"/>
  <COLUMN SOURCE="2" NAME="applicationType" xsi:type="SQLNVARCHAR"/>
  <COLUMN SOURCE="3" NAME="applicationProtocol" xsi:type="SQLNVARCHAR"/>
  <COLUMN SOURCE="4" NAME="elementCluster" xsi:type="SQLNVARCHAR"/>
  <COLUMN SOURCE="5" NAME="element" xsi:type="SQLNVARCHAR"/>
  <COLUMN SOURCE="6" NAME="clientDevice" xsi:type="SQLNVARCHAR"/>
  <COLUMN SOURCE="8" NAME="totalBytesDown" xsi:type="SQLBIGINT"/>
  <COLUMN SOURCE="9" NAME="totalBytesUp" xsi:type="SQLBIGINT"/>
 </ROW>
</BCPFORMAT>

and I am running the following BCP command:

bcp SERVER_NAME.dbo.customDataFromAPI in C:\test_dataFile.csv -t, -f C:\test_format.xml -S SERVER_NAME -T -F2

I have tried to use SSIS and it was working well but it was extremely slow since I have about 200 million such rows to process. As an alternative I wanted to understand and put bcp to test but so far have been unable to even import a single row of data. Sometimes I also run into the infamous EOF error with BCP and have no clue what goes on.


Solution

  • Ok, this happens because bcp is actually trying to insert the first column (the sqldatetimeoffset) into the identity column as it won't skip it when you use the XML format file. The documentation says:

    With an XML format file, you cannot skip a column when you are importing directly into a table by using a bcp command or a BULK INSERT statement.

    The solution is to either use a "old style" format file that lets you define columns to skip, or make the insert through a view:

    CREATE VIEW SkipCol AS 
    SELECT periodStartUTC, M1, M2, M3, M4, M5, M6, M7, M8 FROM customDataFromAPI;
    

    and then bcp into that:

    bcp SERVER_NAME..SkipCol in C:\test_dataFile.csv -t, -f C:\test_format.xml -S SERVER_NAME -T -F2