This query used to work in Azure Data Factory pipeline but stopped working few days ago. Nothing changed in case of file names/ formats etc in Azure Blob storage. Getting error in this line:
SELECT * FROM OPENROWSET (
BULK
'/filepath.csv@snapshot=*', FORMAT = 'CSV'
)
The error says .csv@snapshot=* has URL suffix which is not allowed. Full code:
-- CREATE OR REPLACE VIEW clean.barriers AS
IF EXISTS (SELECT * FROM sys.tables t
JOIN sys.schemas s ON (t.schema_id = s.schema_id)
WHERE s.name = 'clean' AND t.name = 'barriers')
EXEC('DROP EXTERNAL TABLE [clean].[barriers]')
CREATE EXTERNAL TABLE [clean].[barriers]
WITH
(
LOCATION = 'clean/synapse/barriers',
DATA_SOURCE = "",
FILE_FORMAT = [SynapseParquetFormat]
)
AS
SELECT * FROM OPENROWSET (
BULK
'/filepath.csv@snapshot=*', FORMAT = 'CSV'
)
WITH(
-- Schema adjusted to what we have in clean/barriers in Bigquery
mshp_id INT,
prog_name NVARCHAR(256),
barrier_name NVARCHAR(256),
days INT
) AS load_clean_data
As per the Official Documentation, you should have a Data source for the source file also from which you are trying to copy the data.
So, try to create a data source for the source CSV file and check, it may work.
Also, as you are executing the above script using ADF, first try to execute it without ADF and if the error occurs then problem can be with the script not ADF. If not try to change the activity of ADF and check.
You can try this trouble shoot also in your BULK path. As you want the data from that csv files folder give the path like below and check.
/folder/*.csv