Search code examples
sql-serverxmlcharacter-encodingbcp

Using BCP to create XML files produces files with invalid encoding for non-ASCII characters


Reproducing issue, SQL code:

CREATE TABLE dbo.FooTable(SomeString NVARCHAR(100));
INSERT INTO dbo.FooTable(SomeString)
VALUES('Degree symbol is °');
DECLARE @Code NVARCHAR(4000) = N'BCP "SELECT (SELECT SomeString FROM dbo.FooTable FOR XML PATH(''Foo''), ROOT(''BAR''), TYPE )"  QUERYOUT "F:\Output\File.XML" -c -C RAW -T ';

EXEC xp_cmdshell @Code;

DROP TABLE dbo.FooTable;

It creates file with following content:

<BAR><Foo><SomeString>Degree symbol is °</SomeString></Foo></BAR>

Such files are not recognized as valid XML files by Internet Explorer and Firefox (Chrome yields error error on line 1 at column 23: Encoding error). If I open them with Notepad and save as Unicode (little endian) - it opens and validates.

Update:

changing bcp options to -T -w -r -t seems to make XML valid for most XML consumers, but not Internet Explorer.


Solution

  • Try this one -

    IF OBJECT_ID('tempdb.dbo.##t') IS NOT NULL
        DROP TABLE ##t
    
    SELECT x = (
        SELECT x
        FROM (
            VALUES (N'Degree symbol is °')
        ) t(x)
        FOR XML PATH('Foo'), ROOT('BAR'), TYPE
    )
    INTO ##t
    
    DECLARE @sql NVARCHAR(4000) = 'bcp "SELECT * FROM ##t" queryout "D:\sample.xml" -S ' + @@servername + ' -T -w -r -t'
    EXEC sys.xp_cmdshell @sql
    

    Just interesting where the root of your issue...