Search code examples
sqlsql-serverimportbcp

Bulk insert not recognizing row terminators


I'm preparing format files to import text-qualified files into sql-server based on This article

Sample of data for import:

"1000000"|"1100000"|"2017-02-26 00:00:00"|"CAT1"|"Item from CAT1"
"1000001"|"1100000"|"2017-02-26 00:00:01"|"CAT2"|"Item from CAT2"
"1000002"|"1100001"|"2017-02-26 00:01:02"|"CAT2"|"Item from CAT2"
"1000003"|"1100002"|"2017-02-26 01:02:03"|"CAT3"|"Item from CAT3"

My format file:

13.0
6
1       SQLCHAR             0       0       "\""        0     FIRST_QUOTE                     SQL_Latin1_General_CP1_CI_AS
2       SQLINT              0       4       "\"|\""     1     transaction_id                  ""
3       SQLINT              1       4       "\"|\""     2     user_id                         ""
4       SQLDATETIME         0       8       "\"|\""     3     create_date                     ""
5       SQLCHAR             2       10      "\"|\""     4     category                        SQL_Latin1_General_CP1_CI_AS
6       SQLCHAR             2       50      "\"\r\n"    5     item                            SQL_Latin1_General_CP1_CI_AS

Which results in:

The bulk load failed. The column is too long in the data file for row 1, column 6. Verify that the field terminator and row terminator are specified correctly.

I'm fairly certain that file contains \r\n (Checked with Hex editor shows 0x0d,0x0a), although ignoring text qualifiers and format file i was able to import it manually only with

Bulk insert <table_name> from '\\path\to\file' with (fieldterminator='|', rowterminator='\n')

Solution

  • I tried poking at your format file, but it just confirmed that I am not any good with that version of the format file.

    Switching over to an xml format file was easy enough though.

    cat.xml:

    <?xml version="1.0"?>
    <BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <RECORD>
    <FIELD ID="1" xsi:type="CharTerm" TERMINATOR='"'     COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
    <FIELD ID="2" xsi:type="CharTerm" TERMINATOR='"|"'   COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
    <FIELD ID="3" xsi:type="CharTerm" TERMINATOR='"|"'   COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
    <FIELD ID="4" xsi:type="CharTerm" TERMINATOR='"|"'   COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
    <FIELD ID="5" xsi:type="CharTerm" TERMINATOR='"|"'   COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
    <FIELD ID="6" xsi:type="CharTerm" TERMINATOR='"\r\n' COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
    </RECORD>
    <ROW>
    <COLUMN SOURCE="2" NAME="transaction_id" xsi:type="SQLVARYCHAR"/>
    <COLUMN SOURCE="3" NAME="user_id" xsi:type="SQLVARYCHAR"/>
    <COLUMN SOURCE="4" NAME="create_date" xsi:type="SQLVARYCHAR"/>
    <COLUMN SOURCE="5" NAME="category" xsi:type="SQLVARYCHAR"/>
    <COLUMN SOURCE="6" NAME="item" xsi:type="SQLVARYCHAR"/>
    </ROW>
    </BCPFORMAT>
    

    sql:

    bulk insert dbo.cat
       from 'c:\cat.txt'
       with (
          formatfile = 'c:\cat.xml'
        , firstrow   = 1
      );
    

    results:

    enter image description here

    input file, showing row terminator in notepad++:

    enter image description here