Search code examples
pythonsql-serverpandasdataframepypyodbc

error when inserting dataframe into MS-SQL with Python


I want to insert a pandas dataframe into MS-SQL using pypyodbc. Here is my code:

Create a dataframe at first:

df = pd.DataFrame({'C1': [11, 21, 31],
                    'C2': [12, 22, 32],
                    'C3': [13, 23, 33]})
tablename = tb

and then insert dataframe into MS-SQL

def insertDFtoSQL(df,tablename):
    con = ppo.connect(r'Driver={SQL Server};
                     'r'Server=se;'
                      r'Database=db;'
                      r'Trusted_Connection=yes;')
    cols = ','.join([k for k in df.dtypes.index])
    params = ','.join('?' * len(df.columns))
    sql = 'INSERT INTO {0} ({1}) VALUES ({2})'.format(tablename, cols, params)
    data = [tuple(x) for x in df.values]
    con.cursor().executemany(sql, data)
    con.commit()
    con.close()

this code resulted in an error message that said "type object is not subscriptable".

However, everything would be right if I used

data1 = [(11,12,13),(22,23,24),(32,33,34)]

to replace data in executemany(sql, data) as executemany(sql, data1)

Any ideas about it?


Solution

  • df.values is a <class 'numpy.ndarray'>, and when you do

    data = [tuple(x) for x in df.values]
    

    you get a list of tuples containing elements of type <class 'numpy.int64'>. pypyodbc is expecting the tuples to contain "normal" Python types, so you'll need to use

    data = [tuple(int(col) for col in row) for row in df.values]
    

    to convert the numbers into plain old int values.