python anaconda snowflake-cloud-data-platform libraries

How are Anaconda packages "deployed" when using Snowpark in my UDF or stored procedure?


I'd like to gain a deeper understanding of the implications of accepting Anaconda's terms while working with Snowpark. Specifically, I'm curious about what happens to Anaconda packages in the context of Snowflake. Does accepting the Anaconda terms lead to the physical installation of these packages within my Snowflake account? Additionally, I'm interested in whether this installation process occurs before accepting the terms or after.


Solution

  • Anaconda packages are only installed when you run a query that requires them. For example, if you create a Python UDF "foo" that requires packages A and B, and a Python UDF "bar" that requires packages C and D, then when you run a query like this:

    SELECT foo(x), bar(y) FROM T;
    

    The packages A, B, C, and D will be installed on the warehouse node(s) that run the query. The packages are cached, so if a subsequent query in the same warehouse uses those same packages, they won't have to be reinstalled.

    TL;DR When you accept the Anaconda terms, it's effectively just a metadata operation to configure the setting that allows your account to use the packages.

    In case it's useful, I go through a similar explanation in this YouTube video: https://youtu.be/tT0jCX_Bjok?si=x-cGPXhtKEupR3lz&t=281
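
To make the sequence described in this answer concrete, here is a toy model in plain Python. It is only a sketch of the behavior as explained above, not Snowflake's implementation: `ToyAccount`, `ToyWarehouse`, `UDF_PACKAGES`, and the idea that a query fails when the terms flag is unset are all assumptions made for illustration, and nothing here calls the Snowpark API.

```python
from dataclasses import dataclass, field

# Packages each UDF declares, mirroring the foo/bar example (names are placeholders).
UDF_PACKAGES = {"foo": {"A", "B"}, "bar": {"C", "D"}}


@dataclass
class ToyAccount:
    """Hypothetical account: accepting the terms only flips a flag."""
    anaconda_terms_accepted: bool = False

    def accept_terms(self):
        self.anaconda_terms_accepted = True  # metadata only; nothing is installed


@dataclass
class ToyWarehouse:
    """Hypothetical warehouse with a package cache shared across queries."""
    account: ToyAccount
    cache: set = field(default_factory=set)

    def run_query(self, udfs):
        # Assumption: a query that needs packages is rejected without the flag.
        if not self.account.anaconda_terms_accepted:
            raise PermissionError("Anaconda terms have not been accepted")
        # Union of the packages required by every UDF referenced in the query.
        needed = set().union(*(UDF_PACKAGES[name] for name in udfs))
        newly_installed = needed - self.cache  # skip anything already cached
        self.cache |= newly_installed
        return sorted(newly_installed)


account = ToyAccount()
account.accept_terms()
warehouse = ToyWarehouse(account)
cache_after_accept = set(warehouse.cache)     # snapshot taken before any query runs

first = warehouse.run_query(["foo", "bar"])   # SELECT foo(x), bar(y) FROM T
second = warehouse.run_query(["foo"])         # same warehouse, packages already cached
print(cache_after_accept, first, second)
```

In this model, accepting the terms leaves the cache empty, the first query installs all four packages, and the second query installs nothing because the warehouse already holds what it needs.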