Search code examples
pythont-sqlstored-procedurespyodbcexecutemany

Can I use pyodbc executemany with sql stored procedure?


I need to write a lot of data from pandas dataframes to MS-SQL tables (1000's or even more rows at once).

I'm using pyodbc executmany with fast_executemany=True, otherwise it takes hours for each table. Now the IT guys want me to start working with stored procedures instead of direct access to the DB for security reasons.

The problem is that from what I've seen, SQL is, how to say, not as flexible as python, and I have to declare every variable passed to the SP, so I don't see how I can pass the SP with df.values.tolist() as I'm doing today with the executemany function on direct access to the DB.

For example, let's say I have a dataframe df:

id name salary

1 Josh 10000

2 Michael 5000

3 Sara 8000

Today I would use:

cursor.fast_executemany = True
insert_str = "INSERT INTO [%s].[%s] VALUES (?, ?, ?)"
cursor.executemany(insert_str % (scheme, table), df.values.tolist()])

And all the dataframe would be inserted to the table at once (and fast). But, is doesn't seem possible to do the same with calling an SP.

Is there a way to use pyodbc executemany with stored procedures?


Solution

  • Yes, we can use executemany with Stored Procedures in SQL Server:

    Table:

    CREATE TABLE [dbo].[Table_1](
        [id] [int] NOT NULL,
        [name] [nvarchar](50) NULL,
        [salary] [int] NULL,
     CONSTRAINT [PK_Table_1] PRIMARY KEY CLUSTERED 
    (
        [id] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    

    Stored Procedure:

    CREATE PROCEDURE [dbo].[Table_1_insert] 
        @id int = 0, 
        @name nvarchar(50),
        @salary int
    AS
    BEGIN
        SET NOCOUNT ON;
        INSERT INTO Table_1 (id, name, salary) VALUES (@id, @name, @salary);
    END
    

    Python code:

    df = pd.DataFrame(
        [(1, 'Josh', 10000), (2, 'Michael', 5000), (3, 'Sara', 8000)], 
        columns=['id', 'name', 'salary'])
    crsr.executemany("{CALL dbo.Table_1_insert(?, ?, ?)}", df.values.tolist())
    cnxn.commit()