Tags: snowflake-cloud-data-platform, warehouse

The principle behind the effect of Snowflake warehouse size


I'm experimenting with Snowflake and would like to understand how its architecture produces the results below. I ran three queries, A, B, and C, that scan different amounts of data; the Bytes Scanned values are read from Total Statistics in each query's Profile Overview. I compared the execution times on a Small warehouse and a Large warehouse. When a query's scan size is small, the effect of warehouse size is small; as the scan size grows, the performance ratio approaches the 4x difference in warehouse size (Small has 2 nodes, Large has 8).

I would like to understand the principle behind this result.

Total Execution Time:

| Query  | Bytes Scanned (MB) | Large (ms) | Small (ms) | S/L ratio |
|--------|--------------------|------------|------------|-----------|
| QueryA | 1860               | 1350       | 2800       | 2.1       |
| QueryB | 6100               | 3800       | 12500      | 3.3       |
| QueryC | 51940              | 19310      | 77000      | 4.0       |

Solution

  • It's related to how Snowflake (or any analytical MPP database) processes data in parallel. As you know, a warehouse cluster may have multiple nodes: a Small warehouse has 2, a Large warehouse has 8, and so on. Each node has 8 instances that process data in parallel; I will call them "workers" for clarity.

    So a Small warehouse has 2 × 8 = 16 workers, and a Large warehouse has 8 × 8 = 64 workers. If the data size is small (the table has fewer micro-partitions than there are workers), your query will not use the whole warehouse's resources: some workers will stay idle during execution steps such as reading from the table. This is why you get more of a speedup from the large warehouse when you process larger data.
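
    A rough way to see why the S/L ratio converges on 4 is to model a table scan as "waves" of micro-partition reads: in each wave every worker reads one micro-partition, so scan time is proportional to ceil(partitions / workers). The sketch below is a minimal illustration under that assumption; the wave model and the partition counts are illustrative, not Snowflake internals:

    ```python
    import math

    WORKERS_PER_NODE = 8  # per the answer above: 8 parallel workers per node

    def scan_waves(micro_partitions: int, nodes: int) -> int:
        """Number of sequential 'waves' needed for the workers to read
        every micro-partition once (assumed proportional to scan time)."""
        workers = nodes * WORKERS_PER_NODE
        return math.ceil(micro_partitions / workers)

    for partitions in (20, 200, 2000, 20000):
        small = scan_waves(partitions, nodes=2)  # Small warehouse: 16 workers
        large = scan_waves(partitions, nodes=8)  # Large warehouse: 64 workers
        print(f"{partitions:>6} partitions -> S/L ratio = {small / large:.2f}")
    ```

    With few micro-partitions, most of the Large warehouse's extra workers sit idle, so the ratio stays near 2; as the partition count grows, both warehouses are fully busy and the ratio approaches the 4x node-count difference, matching the trend in the measurements above (2.1, 3.3, 4.0).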