Search code examples
snowflake-cloud-data-platformstagedata-ingestion

Ingest unstructured file into snowflake table


Have file with 200 rows, when I tried to load file into snowflake table it will print 200 rows, but I want is 1 row contains data for 200 rows.

create or replace table sample_test_single_col (LOADED_AT timestamp, FILENAME string, single_col varchar(2000) );

COPY INTO sample_test_single_col
from (
  SELECT
    CURRENT_TIMESTAMP as LOADED_AT,
    METADATA$FILENAME as FILENAME,
    s.$1 as single_col from @%table_stage s ) 
    file_format = (type = csv);

Input:-
From:- robert
Sent:   Thursday, August 03, 2006 1:15 PM
To: Jerry
Subject:    RE: Latest news
 All documents are scanned.

Desired output:-
Row LOADED_AT            FILENAME                    SINGLE_COL
1   06-06-2022 03:14    @table_stage/filename.csv    From:- robert 
                                                     Sent:  Thursday, August 03, 2006 1:15 PM
                                                     To:    Jerry
                                                     Subject:   RE: Latest news
                                                     All documents are scanned.

Current Output:-
Row LOADED_AT           FILENAME                    SINGLE_COL
1   06-06-2022 03:14    @table_stage/filename.csv   From:- robert
2   06-06-2022 03:14    @table_stage/filename.csv   Sent:   Thursday, August 03, 2006 1:15 PM
3   06-06-2022 03:14    @table_stage/filename.csv   To: Jerry
4   06-06-2022 03:14    @table_stage/filename.csv   Subject:    RE: Latest news
5   06-06-2022 03:14    @table_stage/filename.csv    All documents are scanned.

Any help will be appreciated!!


Solution

  • The parameter RECORD_DELIMITER's default value when loading data is "New line character". This is why each line becomes a new row when you load the file.

    You can set the parameter to something else (which you don't expect to have in your file):

    COPY INTO sample_test_single_col
    from (
      SELECT
        CURRENT_TIMESTAMP as LOADED_AT,
        METADATA$FILENAME as FILENAME,
        s.$1 as single_col from @mystage s ) 
        file_format = (type = csv RECORD_DELIMITER = 'NONEXISTENT');