Tags: sql, sql-server-2012, bulkinsert, nvarchar

Special characters displaying incorrectly after BULK INSERT


I'm using BULK INSERT to import a CSV file. One of the columns contains values that include fraction characters (e.g. 1m½f).

I don't need to do any mathematical operations on the fractions, as the values are only used for display, so I have defined the column as nvarchar. The BULK INSERT works, but when I view the records in SQL Server the fraction has been replaced with a cent symbol (¢), so the displayed text is 1m¢f.

I'm interested to understand why this is happening and any thoughts on how to resolve the issue. The BULK INSERT command is:

BULK INSERT dbo.temp FROM 'C:\Temp\file.csv' 
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n' );

Solution

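  • Run the BULK INSERT with CODEPAGE = 'ACP', which converts character data from the ANSI/Windows code page (1252) to the SQL Server code page. Without this option, BULK INSERT defaults to CODEPAGE = 'OEM' and reads the file using the system OEM code page instead.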

    BULK INSERT dbo.temp FROM 'C:\Temp\file.csv' 
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', CODEPAGE = 'ACP');
    

    If the file is UTF-8 and your version of SQL Server supports code page 65001 (SQL Server 2016 and later do):

    [...] , CODEPAGE = '65001');
    

    You may also need to specify DATAFILETYPE as one of 'char', 'native', 'widechar', or 'widenative'; for example, 'widechar' for a file saved as Unicode (UTF-16).
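As for why the ½ turned into ¢: a likely explanation is that the file stores ½ as the single byte 0xBD (Windows-1252), and that byte was read with an OEM code page in which 0xBD is the cent sign. The short Python sketch below reproduces this outside SQL Server. It assumes the CSV was saved as Windows-1252 and that the server's OEM code page is 850; neither is stated in the question, so treat it as an illustration of the mechanism, not a diagnosis.

```python
# Illustration only: plain Python, run outside SQL Server.
# Assumptions (not stated in the question): the CSV is saved as
# Windows-1252 and the server's OEM code page is 850.

raw = "1m½f".encode("cp1252")   # the bytes as they would sit in the file
as_oem = raw.decode("cp850")    # same bytes read through OEM code page 850
as_acp = raw.decode("cp1252")   # same bytes read through the ANSI code page

# If the file were UTF-8 instead, ½ would be two bytes and would show up
# as two wrong characters, not one.
utf8_as_oem = "½".encode("utf-8").decode("cp850")

print(raw)               # the byte for ½ is 0xBD
print(as_oem)            # the fraction is shown as a cent sign
print(as_acp)            # the fraction survives
print(len(utf8_as_oem))  # number of characters ½ turns into
```

Seeing exactly one wrong character in place of ½ suggests a single-byte encoding in the file, which is why CODEPAGE = 'ACP' is the first thing to try here.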