
What is hybrid-columnar storage?


Snowflake stores data using a hybrid-columnar storage method. I understand what columnar storage is and what its benefits are, but what does "hybrid" mean? Does it simply refer to Snowflake accessing blob storage from different cloud providers?


Solution

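  • Snowflake uses PAX (Partition Attributes Across), also known as hybrid-columnar storage. "Hybrid" describes the data layout, not the cloud provider's blob storage. It means:

    • Horizontal and vertical partitioning: rows are first grouped into partitions, and within each partition the values are stored column by column
    • Automatic column-level and partition-level compression
    • Natural (built-in) data clustering based on ingestion date (order)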

    This was part of the SIGMOD 2016 presentation, available here.

    A paper on PAX can be found here.
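
To make the layout concrete, here is a minimal sketch of the PAX idea in Python. It is only an illustration, not Snowflake's actual file format: the function names `to_pax` and `prune`, the sample rows, and the partition size of 2 are all made up for this example.

```python
from typing import Any

Row = dict[str, Any]
Partition = dict[str, list[Any]]


def to_pax(rows: list[Row], rows_per_partition: int) -> list[Partition]:
    """Group rows into partitions, then store each partition column by column."""
    partitions: list[Partition] = []
    for start in range(0, len(rows), rows_per_partition):
        # Horizontal partitioning: a contiguous slice of rows, in ingestion order.
        chunk = rows[start:start + rows_per_partition]
        # Vertical partitioning: inside the slice, one list of values per column.
        columns = {name: [row[name] for row in chunk] for name in chunk[0]}
        partitions.append(columns)
    return partitions


def prune(partitions: list[Partition], column: str, value: Any) -> list[int]:
    """Return indexes of partitions whose min/max range for `column` can contain `value`."""
    return [
        i for i, p in enumerate(partitions)
        if min(p[column]) <= value <= max(p[column])
    ]


# Hypothetical sample data, already ordered by ingestion date.
rows = [
    {"id": 1, "day": "2024-01-01", "amount": 10},
    {"id": 2, "day": "2024-01-01", "amount": 25},
    {"id": 3, "day": "2024-01-02", "amount": 7},
    {"id": 4, "day": "2024-01-02", "amount": 40},
]

partitions = to_pax(rows, rows_per_partition=2)

# A column scan touches only the "amount" values in each partition.
total = sum(sum(p["amount"]) for p in partitions)

# Because rows arrive in date order, a filter on "day" can skip whole partitions.
matching = prune(partitions, "day", "2024-01-02")

print(partitions[0])
print(total)
print(matching)
```

A pure row store would keep each record together, and a pure column store would keep one long list per column for the whole table. The hybrid sits in between: each partition is a self-contained block of rows, but reads inside the block are columnar, which is also what makes per-column compression and partition-level skipping possible.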