So I decided for the fun of it to read a text file and store the contents into a NVARCHAR
using TSQL and the Microsoft SQL Server Management Studio 2008 R2. I found an example for doing this at https://www.simple-talk.com/sql/t-sql-programming/the-tsql-of-text-files/
So I tried this with my ABC.txt
file and its contents are:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
abcdefghijklmnopqrstuvwxyz
When I first tried to store the contents of this file into@myString
I used this code:
declare @myString nvarchar(max);
Select @myString = BulkColumn
from OPENROWSET(Bulk 'C:\Users\<myComputer'sNameHere>\Documents\How2\FilesForTestingStuff\ABC.txt', SINGLE_BLOB) as x
print @myString;
I got this as my output when I printed the string:
䉁䑃䙅䡇䩉䱋乍偏剑呓噕塗婙扡摣晥桧橩汫湭灯牱瑳癵硷穹
I changed nvarchar
to varchar
and I got the correct contents of the file.
Anyone know why this happend? I didn't think that there's a conversion difference other than nvarchar
has more space available than varchar
and is able to hold unicode characters.
Also how do you normally attempt reading from a file and inserting the contents into a nvarchar
?
I suppose it depends on the encoding of the input file.
You used SINGLEBLOB
and according to MSDN it causes data to be returned as varbinary(MAX)
. Your file was probably saved using a non-unicode encoding, so when it was imported data into nvarchar
column, SQL interpreted it incorrectly. Changing the type allowed characters to be read correctly. Please try to encode the file with UTF-16 and try to import data into a nvarchar(MAX) variable.
Update
I tried to recreate the issue You described. I've saved a text file with ANSI encoding, run the import script and got the output similar to the one You posted in Your question. Then, I converted the file to UCS-2 Little Endian encoding and after running the script I got correct output.
To sum it up, if You want to use importing with SINGLEBLOB
option, just convert the file with data to use UCS-2 Little Endian encoding and it should work correctly with nvarchar
SQL type.
Reference links: