Search code examples
sql-serverpandasbcp

Writing with bcpandas fails with errors "Invalid character value for cast specification"


I'm writing a Python Pandas dataframe to SQL Server 17 with BCP using the bcpandas package. Pandas 1.5.1, bcpandas 2.6.0. The write command looks like:

from bcpandas import SqlCreds, to_sql
to_sql(my_df, table_name, my_bcp_creds, if_exists = "replace", index = False, debug = True)

As the BCP write occurs, this message prints to the console:

SQLState = 22005, NativeError = 0 Error = [Microsoft][ODBC Driver 17 for SQL Server]Invalid character value for cast specification

It prints once for each row that BCP can't write. Eventually after 5 or 10 of these, the BCP write is aborted, having written 3k of the 80k rows in the Pandas dataframe.

How do I solve this?


Solution

  • General strategy

    First limit the number of rows of your dataframe being written with e.g., df.head(11) until you isolate the first single row where the error message gets thrown. Save it with x = df.head(11).tail(1).

    Then with this one-row dataframe, loop through the columns. In the loop, for each column, print the column name, then make a copy of the dataframe without that column and write it with BCP to a test destination table. One of these iterations should succeed without the error, indicating which column had the problematic character.

    My specific case

    In my case, the above strategy revealed that the character \n appearing in a comment string was the culprit. I removed it with a command:

    df['REVIEW COMMENTS'] = df['REVIEW COMMENTS'].str.replace('\n',' ')
    

    After which the BCP write succeeded.