Tags: compression, snowflake-cloud-data-platform

Compression analysis in snowflake


In Redshift, we can run `ANALYZE COMPRESSION`, which reports the compression percentage achieved for each column. Is there anything similar to that command in Snowflake?


Solution

  • One of the key concepts of Snowflake is automatic storage handling:

    Database Storage

    When data is loaded into Snowflake, Snowflake reorganizes that data into its internal optimized, compressed, columnar format. Snowflake stores this optimized data in cloud storage.

    Snowflake manages all aspects of how this data is stored — the organization, file size, structure, compression, metadata, statistics, and other aspects of data storage are handled by Snowflake. The data objects stored by Snowflake are not directly visible nor accessible by customers; they are only accessible through SQL query operations run using Snowflake.

    Benefits of Micro-partitioning

    The benefits of Snowflake’s approach to partitioning table data include:

    • As the name suggests, micro-partitions are small in size (50 to 500 MB, before compression), which enables extremely efficient DML and fine-grained pruning for faster queries.

    ...

    • Columns are also compressed individually within micro-partitions. Snowflake automatically determines the most efficient compression algorithm for the columns in each micro-partition.

    If the goal of `ANALYZE COMPRESSION` is to find potential disk-space savings, that process is already done automatically: Snowflake compresses every column on load, so there is no manual compression-analysis step to run.
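
    Although Snowflake exposes no per-column compression report, you can still see how much compressed storage a table actually uses. A minimal sketch, assuming a hypothetical database `my_db` and table `MY_TABLE`, using the `INFORMATION_SCHEMA.TABLE_STORAGE_METRICS` view:

    ```sql
    -- Compressed storage consumed by one table, broken down by storage state.
    -- All byte counts reported by Snowflake reflect the compressed size.
    SELECT table_name,
           active_bytes,        -- compressed bytes in active (queryable) storage
           time_travel_bytes,   -- compressed bytes retained for Time Travel
           failsafe_bytes       -- compressed bytes retained for Fail-safe
    FROM   my_db.information_schema.table_storage_metrics
    WHERE  table_name = 'MY_TABLE';

    -- SHOW TABLES also reports the compressed size in its BYTES column:
    SHOW TABLES LIKE 'MY_TABLE';
    ```

    Comparing these figures against the raw size of the source files you loaded gives a rough overall compression ratio, though not the per-column breakdown that Redshift's `ANALYZE COMPRESSION` provides.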