Search code examples
pythonpandassnowflake-cloud-data-platform

How to insert NULL into Snowflake string column with pandas?


I am using the pandas_tools package to write data to my Snowflake table.

create table TEST_TBL(
    ID integer,
    VALUE string
)
import pandas as pd
from snowflake.connector import pandas_tools as pt

df = pd.DataFrame(data={'ID': [1, 2],
                        'VALUE': ['test', None]
                       })

# assume everything else is set up previously
pt.write_pandas(conn=conn, 
                df=df, 
                table_name=table_name,
                database=database,
                schema=schema)

Using write_pandas(), instead of inserting a NULL value in the VALUE column where ID=2 (which is what I would expect to happen with a python NoneType), it instead writes the literal string "None" into the table.

Snowflake table

I have tried it with np.nan as well and that also writes a literal string "NaN" into the column. How do I get it to insert a NULL value into the record?

Sample of expected null outputs from another table.

Sample data from another table


Solution

  • Your code appears to be correct.

    import snowflake.connector
    import pandas as pd
    from snowflake.connector import pandas_tools as pt
    
    conn = snowflake.connector.connect(**connection_parameters)
    cursor = conn.cursor()
    database = connection_parameters['database']
    schema = connection_parameters['schema']
    table_name = 'TEST_TBL'
    
    sql = f'create table {table_name}(ID integer,VALUE string)'
    cursor.execute(sql)
    
    df = pd.DataFrame(data={'ID': [1, 2],
                            'VALUE': ['test', None]
                           })
    
    # assume everything else is set up previously
    pt.write_pandas(conn=conn, 
                    df=df, 
                    table_name=table_name,
                    database=database,
                    schema=schema)
    
    sql = f'select * from {table_name}'
    cursor.execute(sql)
    for (ID, VALUE) in cursor:
       print('ID: {0}, VALUE: {1}'.format(ID, VALUE))
    
    ID: 1, VALUE: test
    ID: 2, VALUE: None
    

    enter image description here