Tags: python, databricks, snowflake-cloud-data-platform, python-cryptography

How do I connect to Snowflake using the Python Snowflake connector from within Databricks on Python 3?


When I attach the snowflake-sqlalchemy library to a Python 3 cluster in Databricks, it breaks my Python environment, and installing any subsequent library fails with the following error:

AttributeError: cffi library '_openssl' has no function, constant or global variable named 'Cryptography_HAS_ED25519'

I have tried attaching the latest version of the cryptography library to the cluster separately, but this gave me the same error. I think it might be related to the following links:

connecting-to-snowflake-from-azure-databricks-notebook-message-openssl-has-no-function-constant-or-global-variable-named-cryptography

https://github.com/snowflakedb/snowflake-connector-python/issues/32

The second link mentions a workaround:

The workaround is:
Uninstall cryptography by running pip uninstall cryptography
Delete the directory .../site-packages/cryptography/ manually
Reinstall snowflake-connector-python

Looks like the directory structure of cryptography changed since 1.7.2.*

Is there any way to uninstall the pre-installed cryptography 1.5 python library within Databricks so that I can reinstall the latest version of cryptography (2.5) with the new directory structure?
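
Before uninstalling anything, it may help to confirm which versions the cluster's Python actually sees and where they live on disk. The following is a minimal sketch, assuming Python 3.8 or later for `importlib.metadata`. The helper name `describe_packages` is hypothetical, and the package names are simply the ones mentioned in this question.

```python
from importlib import metadata
from importlib.util import find_spec


def describe_packages(packages):
    """Return {distribution: (version, location)}.

    `packages` maps a distribution name (as pip knows it) to its import
    name, because the two can differ (pyOpenSSL is imported as OpenSSL).
    version is None when the distribution is not installed; location is
    None when no module with that import name can be found.
    """
    report = {}
    for dist_name, import_name in packages.items():
        try:
            version = metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            version = None
        # find_spec locates a top-level module without importing it, so a
        # broken install does not raise the AttributeError from the question.
        try:
            spec = find_spec(import_name)
        except (ImportError, ValueError):
            spec = None
        report[dist_name] = (version, spec.origin if spec is not None else None)
    return report


if __name__ == "__main__":
    wanted = {"cryptography": "cryptography", "pyOpenSSL": "OpenSSL"}
    for name, (version, location) in describe_packages(wanted).items():
        print(name, version, location)
```

The reported location shows which site-packages directory the workaround above would have to clean up.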


Solution

  • I have found an answer to my problem.

    The issue is caused by the OpenSSL bindings preinstalled on Databricks (the pyopenssl package) being too old for snowflake-sqlalchemy to work with.

    The solution is as follows:

    1. Upgrade pip

      %sh /databricks/python/bin/pip install --upgrade pip

    2. Uninstall pyopenssl

      %sh /databricks/python/bin/pip uninstall pyopenssl -y

    3. Reinstall the latest pyopenssl

      %sh /databricks/python/bin/pip install --upgrade pyopenssl

    4. Install snowflake-sqlalchemy

      %sh /databricks/python/bin/pip install --upgrade snowflake-sqlalchemy
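
Once step 4 succeeds, connecting could look roughly like the sketch below. It assumes the snowflake-sqlalchemy URL form `snowflake://<user>:<password>@<account>/<database>/<schema>?warehouse=<warehouse>`. The helper names and every credential value are placeholders, not taken from the original post.

```python
from urllib.parse import quote_plus, urlencode


def build_snowflake_url(user, password, account, database=None, schema=None, **params):
    """Build a snowflake:// connection URL for SQLAlchemy."""
    # Percent-encode the credentials so characters such as '@' or '/'
    # in a password do not break the URL.
    url = "snowflake://{}:{}@{}/".format(quote_plus(user), quote_plus(password), account)
    if database:
        url += database
        if schema:
            url += "/" + schema
    if params:
        url += "?" + urlencode(params)
    return url


def connect_and_check(url):
    """Open a connection and return the Snowflake version string."""
    # Imported lazily so the URL helper works before the libraries are installed.
    from sqlalchemy import create_engine, text

    engine = create_engine(url)
    try:
        with engine.connect() as connection:
            return connection.execute(text("select current_version()")).scalar()
    finally:
        engine.dispose()
```

In a notebook, `connect_and_check(build_snowflake_url(...))` returning a version string would suggest that the rebuilt OpenSSL stack is working. Reading the password from a Databricks secret scope rather than a literal is probably the safer choice.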

    The answer to this question was helpful: Python AttributeError: 'module' object has no attribute 'SSL_ST_INIT'

    I have created an init script so that these steps run each time a cluster starts, using the following code:

    dbutils.fs.mkdirs("dbfs:/databricks/init/")
    
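    # Write the init script. The shebang must be the very first line of the
    # file, so the string starts immediately after the opening quotes.
    dbutils.fs.put("dbfs:/databricks/init/sf-initiation.sh", """#!/bin/bash
    /databricks/python/bin/pip install --upgrade pip
    /databricks/python/bin/pip uninstall pyopenssl -y
    /databricks/python/bin/pip install --upgrade pyopenssl
    /databricks/python/bin/pip install --upgrade snowflake-sqlalchemy
    """, True)  # True overwrites an existing file of the same name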
    

    The last command in the file updates all outdated packages as in: Upgrading all packages with pip
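
The init file as listed ends with the snowflake-sqlalchemy install, so for reference, an upgrade-all step could be sketched as follows. This assumes a reasonably recent pip that supports `pip list --outdated --format=json`, and it reuses the `/databricks/python/bin/pip` path from the steps above. The helper names are hypothetical, and this is not necessarily the command the original init file used.

```python
import json
import subprocess

PIP = "/databricks/python/bin/pip"  # path used in the steps above


def outdated_names(pip_json):
    """Extract package names from `pip list --outdated --format=json` output."""
    return [entry["name"] for entry in json.loads(pip_json)]


def upgrade_command(names, pip=PIP):
    """Return the argv list that upgrades the given packages, or None if empty."""
    if not names:
        return None
    return [pip, "install", "--upgrade"] + list(names)


def upgrade_all_outdated(pip=PIP):
    """Ask pip which packages are outdated and upgrade them in one call."""
    listing = subprocess.run(
        [pip, "list", "--outdated", "--format=json"],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    ).stdout
    command = upgrade_command(outdated_names(listing), pip)
    if command is not None:
        subprocess.run(command, check=True)
```

Upgrading every package at once can also replace versions that the Databricks runtime depends on, so upgrading only the packages named in the steps above may be the more conservative option.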