
What is the right way to get Avro files containing JSON into a table structure on Snowflake?


I've been struggling to get my data from Azure Event Hub into a SQL table on the Snowflake platform. I just can't wrap my head around how to do it properly when the data has to be transformed multiple times. My data is in the Body field of the Avro files.

I've only just started with Snowflake. I've tried to follow this tutorial on the subject, but it never actually saves the JSON-formatted body anywhere in the video. So far I've tried something like this:

CREATE DATABASE IF NOT EXISTS MY_DB;
USE DATABASE MY_DB;
CREATE OR REPLACE TABLE data_table(
  "column1" STRING,
  "column2" INTEGER,
  "column3" STRING
);
create or replace file format av_avro_format
  type = 'AVRO'
  compression = 'NONE';

create or replace stage st_capture_avros
  url='azure://xxxxxxx.blob.core.windows.net/xxxxxxxx/xxxxxxxxx/xxxxxxx/1/'
  credentials=(azure_sas_token='?xxxxxxxxxxxxx')
  file_format = av_avro_format;

-- staging table for the decoded JSON body, referenced below
create or replace table avro_as_json_table(body STRING);

copy into avro_as_json_table(body)
from(
    select HEX_DECODE_STRING($1:Body)
    from @st_capture_avros
);

copy into data_table("column1", "column2", "column3")
from(
    select $1:"jsonKeyValue1", $1:"jsonKeyValue2", $1:"jsonKeyValue3"
    from avro_as_json_table
);

This doesn't work: it produces the error "SQL compilation error: COPY statement only supports simple SELECT from stage statements for import". I know I should use INSERT INTO instead of COPY in the last statement, but my question is more about how I would eliminate the redundant avro_as_json_table from the equation altogether.


Solution

  • Rather than using

    copy into avro_as_json_table(body)
    from ...
    

    try

    INSERT INTO avro_as_json_table(body)
    SELECT HEX_DECODE_STRING($1:Body)
    FROM @st_capture_avros;
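
  • As for eliminating avro_as_json_table from the equation: you may be able to do the decode and the JSON extraction in a single COPY straight from the stage, since COPY supports simple transformations in its SELECT. Only a subset of SQL functions is allowed in such transformations, so treat the following as a sketch; it assumes HEX_DECODE_STRING and PARSE_JSON are both accepted inside a COPY transformation (the column and key names are taken from the question):

    copy into data_table("column1", "column2", "column3")
    from(
        select
            PARSE_JSON(HEX_DECODE_STRING($1:Body)):"jsonKeyValue1"::STRING,
            PARSE_JSON(HEX_DECODE_STRING($1:Body)):"jsonKeyValue2"::INTEGER,
            PARSE_JSON(HEX_DECODE_STRING($1:Body)):"jsonKeyValue3"::STRING
        from @st_capture_avros
    );

    If COPY rejects either function, fall back to the two-step approach above, using INSERT INTO ... SELECT for the second step.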