sql-server t-sql sql-server-2008-r2 varchar nvarchar

TSQL reading from file into varchar/nvarhar error

So I decided for the fun of it to read a text file and store the contents into a NVARCHAR using TSQL and the Microsoft SQL Server Management Studio 2008 R2. I found an example for doing this at https://www.simple-talk.com/sql/t-sql-programming/the-tsql-of-text-files/

So I tried this with my ABC.txt file and its contents are:

ABCDEFGHIJKLMNOPQRSTUVWXYZ

abcdefghijklmnopqrstuvwxyz

When I first tried to store the contents of this file into@myString I used this code:

declare @myString nvarchar(max); 

Select @myString  = BulkColumn
from OPENROWSET(Bulk 'C:\Users\<myComputer'sNameHere>\Documents\How2\FilesForTestingStuff\ABC.txt', SINGLE_BLOB) as x 
print @myString;

I got this as my output when I printed the string:

䉁䑃䙅䡇䩉䱋乍偏剑呓噕塗婙਍扡摣晥桧橩汫湭灯牱瑳癵硷穹

I changed nvarchar to varchar and I got the correct contents of the file.

Anyone know why this happend? I didn't think that there's a conversion difference other than nvarchar has more space available than varchar and is able to hold unicode characters.
Also how do you normally attempt reading from a file and inserting the contents into a nvarchar?

Solution

I suppose it depends on the encoding of the input file.

You used SINGLEBLOB and according to MSDN it causes data to be returned as varbinary(MAX). Your file was probably saved using a non-unicode encoding, so when it was imported data into nvarchar column, SQL interpreted it incorrectly. Changing the type allowed characters to be read correctly. Please try to encode the file with UTF-16 and try to import data into a nvarchar(MAX) variable.

Update

I tried to recreate the issue You described. I've saved a text file with ANSI encoding, run the import script and got the output similar to the one You posted in Your question. Then, I converted the file to UCS-2 Little Endian encoding and after running the script I got correct output.

To sum it up, if You want to use importing with SINGLEBLOB option, just convert the file with data to use UCS-2 Little Endian encoding and it should work correctly with nvarchar SQL type.

Reference links: