Search code examples
sql-serverinsertdelimiterbulknon-printing-characters

Using a non-printable char (other then \0 \n \t) as fieldterminator argument in a bulk insert query


I cannot believe that this is not answered somewhere, but I have searched MSDN, Stack Exchange and Google. I'm beginning to suspect it is not possible the way I would want to do this.

Does anybody know a way to do this in a query (not using other tools, I have seen answers using SSIS) using SQL-server?

Say I have a CSV file containing fields terminated by SOH (StartOfHeader, char(1)) and \n als row terminator. I thought I could import it using BULK INSERT, something like:

BULK INSERT Fubar
[dbo].[Persovorlage_DE]FROM 'c:\temp\foo.bar'
WITH
(
    FIRSTROW = 2,
    FIELDTERMINATOR = char(1),
    ROWTERMINATOR = '\n',
    TABLOCK,
    CODEPAGE = '1252'
)

That would do it, except that char(1) is not accepted.

Is there any way to set FIELDTERMINATOR to char(1)?

Postscript 1:

JeroenMostert suggested in the comments to use select char(1), and copy paste it between the single quotes of FIELDTERMINATOR which is invisible but works.

Postscript 2:

HoneyBadger suggested using 0x01, which also seems to work:

BULK INSERT Fubar FROM 'c:\temp\Foo.bar'
WITH
(
    FIRSTROW = 2,
    FIELDTERMINATOR = '0x01',
    ROWTERMINATOR = '\n',
    TABLOCK,
    CODEPAGE = '1252'
)

Put it in an answer and I will accept it


Solution

  • While the docs don't mention it (or even actively deny it), you can use hexadecimal values to specify particular characters. Such as:

    BULK INSERT Fubar FROM 'c:\temp\Foo.bar'
    WITH
    (
        FIRSTROW = 2,
        FIELDTERMINATOR = '0x01',
        ROWTERMINATOR = '\n',
        TABLOCK,
        CODEPAGE = '1252'
    )
    

    Similarly, if you have a \n terminator, you can use 0x0a. If you don't, \n will be automatically prefixed with \r (although I can't recall by heart in what circumstances it does), and your terminator won't be recognized.