Search code examples
csvt-sqlgit-bashbcp

bulk copy utility is ignoring first data row in source CSV file


While trying to figure out how the bulk copy utility works so it can be applied to a large dataset, I discovered an issue - the first row of data does not get loaded into the database table. Here is the simple example I'm using to learn how bcp works -

DDL:

use sandbox;

drop table dbo.people;

create table dbo.people
( 
  PersonID smallint NOT NULL,
  FullName varchar(60) NOT NULL,
  PerformanceRating varchar(1) NOT NULL
);

PeopleTableFormat.xml:

<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <RECORD>
  <FIELD ID="1" xsi:type="CharTerm" TERMINATOR=",&quot;" MAX_LENGTH="7"/>
  <FIELD ID="2" xsi:type="CharTerm" TERMINATOR="&quot;," MAX_LENGTH="60" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
  <FIELD ID="3" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="1" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
 </RECORD>
 <ROW>
  <COLUMN SOURCE="1" NAME="PersonID" xsi:type="SQLSMALLINT"/>
  <COLUMN SOURCE="2" NAME="FullName" xsi:type="SQLVARYCHAR"/>
  <COLUMN SOURCE="3" NAME="PerformanceRating" xsi:type="SQLVARYCHAR"/>
 </ROW>
</BCPFORMAT>

LoadPeopleDataFromCsvFile.sh:

#!/bin/bash
bcp dbo.people in ./data.csv -S MyServerName -E -d sandbox -T -c -t "," -F 1 -e errors.csv -f ./PeopleTableFormat.xml

data.csv:

PersonID,FullName,PerformanceRating
107,"Brown, George",A
128,"White, Amanda",B
133,"Green, Greg",B

The data loading process is triggered by calling the shell script from Git Bash. The terminal output is this:

enter image description here

Here is a screenshot of the data that loaded:

enter image description here

Here is what errors.csv contains:

enter image description here

I'm not finding this error message helpful. Why isn't the first row of data being loaded?


Solution

  • To correct the issue of the first data row not being imported, the column header row in data.csv had to be updated to match the terminators specified in the XML format file:

    PersonID,"FullName",PerformanceRating
    

    To correct 2 console errors the argument supplied for the -F switch in LoadPeopleDataFromCsvFile.sh had to be changed to import data starting on row 2 of the CSV file and the -c switch had to be removed:

    bcp dbo.people in ./data.csv -S MyServerName -E -d sandbox -T -t "," -F 2 -e errors.csv -f ./PeopleTableFormat.xml