Search code examples
snowflake-cloud-data-platformsnowflake-stage

Ignore different files format from stage select in Snowflake


I have a stage in Snowflake which refers to a S3 bucket with other files than JSON (HTML too).

When I select from this stage I get this error:

SELECT s.$1 FROM @RAW.DEV.STAGE_LOCAL (file_format => 'JSON') AS s

Error:

SQL Error [100069] [22P02]: Error parsing JSON: invalid character outside of a string: '<'
  File 'local/2023-04-10-16/htmlreport.html', line 1, character 1
  Row 0, column $1

because it's reading the HTML and JSON files.

Is there some way to ignore those HTML files in the selection?


Solution

  • You could use PATTERN:

    PATTERN => 'regex_pattern'

    A regular expression pattern string, enclosed in single quotes, specifying the file names and/or paths on the external stage to match.

    For the best performance, try to avoid applying patterns that filter on a large number of files.

    SELECT s.$1 
    FROM @RAW.DEV.STAGE_LOCAL (file_format => 'JSON', PATTERN => '.*[.]json') AS s