I've been following an Azure analytics tutorial and I'm trying to ingest a CSV file into an Azure Analytics Workspace. But when I try to group by a column in the SQL it throws an error. The SQL is this:
-- This is auto-generated code
SELECT
TOP 100 *
FROM
OPENROWSET(
BULK 'https://daveslakestore.dfs.core.windows.net/daves-fs/products.csv',
FORMAT = 'CSV',
PARSER_VERSION = '2.0',
HEADER_ROW = TRUE
) AS [result];
and the error generated is:
potential conversion error while reading VARCHAR column 'ProductName' from UTF8 encoded text. Change database collation to a UTF8 collation or specify explicit column schema in WITH clause and assign UTF8 collation to VARCHAR columns [etc]...
I've determined the collation by executing:
SELECT DATABASEPROPERTYEX(DB_NAME(), 'Collation') AS Collation;
which returns 'SQL_Latin1_General_CP1_CI_AS'
I assume this should be UTF-8 not Latin and Googling suggests this is a known error that requires me to alter the collation, but I have been unable to see how. I've tried without success to do the SQL below but I don't know what my database name is, or even if this is the right strategy?:
USE master;
GO
ALTER DATABASE MyOptionsTest
COLLATE Latin1_General_100_BIN2_UTF8 ; --?
GO
SELECT name, collation_name
FROM sys.databases
WHERE name = N'MyOptionsTest'; —what should this really be?
GO
Your file probably hasn't a header row in Synapse. First see the columns:
-- This is auto-generated code
SELECT
TOP 100 *
FROM
OPENROWSET(
BULK 'https://daveslakestore.dfs.core.windows.net/daves-fs/products.csv',
FORMAT = 'CSV',
PARSER_VERSION = '2.0'
) AS [result]
and give name to columns that you need:
-- This is auto-generated code
SELECT
TOP 100 *
FROM
OPENROWSET(
BULK 'https://daveslakestore.dfs.core.windows.net/daves-fs/products.csv',
FORMAT = 'CSV',
PARSER_VERSION = '2.0'
)
WITH(
columnname1 INT 2, --2nd column
columnname2 VARCHAR(50) 5, --5nd column
columnname3 VARCHAR(50) 10 --10th column
)AS [result]
If you want to view with column header, don't change your code. Just go to file in synapse, right click-> preview-> with column header -ON -> OK.