pandas, pymysql

Column width not sufficient when copying a DataFrame to SQL


I have the following line in the code that works great!

df.to_sql(con=connection, name='my_tbl_name', if_exists='replace', flavor='mysql')

The problem is that it creates the column as varchar(63), and my data is longer than 63 bytes.

I get this warning:

/home/anaconda/lib/python2.7/site-packages/pymysql/cursors.py:134: Warning: Data truncated for column '2' at row 1
  result = self._query(query)

How do I make sure that pandas creates the table with varchar(255) instead of varchar(63)?


Solution

  • Assuming you are using pandas >= 0.14, you should use SQLAlchemy to make the database connection (passing a DBAPI connection directly for MySQL is deprecated; see http://pandas.pydata.org/pandas-docs/stable/io.html#sql-queries):

    import sqlalchemy
    engine = sqlalchemy.create_engine("mysql+pymysql://....")
    df.to_sql('my_tbl_name', engine, if_exists='replace')
    

    With this approach your text column will not be truncated, because pandas maps string columns to the SQLAlchemy TEXT type by default. If you want a different type, e.g. VARCHAR with a specific length, you can use the dtype argument (this needs pandas 0.16):

    from sqlalchemy.types import VARCHAR
    df.to_sql('my_tbl_name', engine, if_exists='replace', dtype={'Col_2': VARCHAR(256)})
    

    If you are using an older version of pandas, or cannot use SQLAlchemy for some reason, see the answer to "pandas to_sql truncates my data" for a workaround.
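
If you would rather not hard-code a length per column, the dtype mapping can be built from the data itself. The following is only a rough sketch: the helper name `varchar_dtypes`, the `min_length` default of 255, and the sample DataFrame are assumptions made for illustration, not part of pandas. It uses an in-memory SQLite engine so it runs without a server; SQLite does not enforce VARCHAR lengths, so the widths only take effect once you point the engine at MySQL.

```python
import pandas as pd
import sqlalchemy
from sqlalchemy.types import VARCHAR


def varchar_dtypes(df, min_length=255):
    """Map each string column to a VARCHAR wide enough for its longest value.

    Hypothetical helper: min_length is an assumed floor, not a pandas default.
    """
    dtypes = {}
    for col in df.columns:
        if pd.api.types.is_string_dtype(df[col]):
            # Measure the longest non-null value, in characters.
            lengths = df[col].dropna().astype(str).str.len()
            longest = int(lengths.max()) if len(lengths) else 0
            dtypes[col] = VARCHAR(max(min_length, longest))
    return dtypes


# Hypothetical data: one value is far longer than 63 characters.
df = pd.DataFrame({"Col_1": [1, 2], "Col_2": ["short", "x" * 300]})

# In-memory SQLite keeps the sketch runnable; for the real database,
# use a MySQL URL such as "mysql+pymysql://...." instead.
engine = sqlalchemy.create_engine("sqlite://")

dtypes = varchar_dtypes(df)
df.to_sql("my_tbl_name", engine, if_exists="replace", index=False, dtype=dtypes)

# Read the table back to check that the long value survived the round trip.
result = pd.read_sql_query("SELECT * FROM my_tbl_name", engine)
```

Numeric columns are left out of the mapping, so pandas still picks their SQL types on its own. Note that a MySQL VARCHAR length counts characters, not bytes, which is why the helper measures string length rather than encoded size.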