In a small program, I'm using the Pandas to_sql method to push data to the database. But in my scenario, I need to set auto-commit to false, but Pandas commits nonetheless. Is there a workaround for this?
# Example code
df = pd.read_csv(file)
# In here i need to turn auto commit off
with conn.connect() as c:
df.to_sql('table_name', c)
# Some critical workload
# updating another table
# Then commit after updating the table
c.commit()
With SQLAlchemy 2.0, with engine.connect() as c:
and with engine.begin() as c:
normally differ in that the former will not automatically commit when the context manager exits, while the latter will automatically commit (unless an error occurs). So,
engine.echo = True
with engine.connect() as c:
c.exec_driver_sql("INSERT INTO table_name (id) VALUES (1)")
print(">>> Done.")
results in a rollback
… INSERT INTO table_name (id) VALUES (1)
… [raw sql] ()
… ROLLBACK
>>> Done.
However, to_sql()
does automatically commit
df = pd.DataFrame([(1,), (2,)], columns=["id"])
engine.echo = True
with engine.connect() as c:
df.to_sql("table_name", c, if_exists="append", index=False)
print(">>> Done.")
… INSERT INTO table_name (id) VALUES (?)
… [generated in 0.00087s] [(1,), (2,)]
… COMMIT
>>> Done.
and the commit happens immediately after the .to_sql()
completes, as shown by adding a breakpoint within the context manager
df = pd.DataFrame([(1,), (2,)], columns=["id"])
engine.echo = True
with engine.connect() as c:
df.to_sql("table_name", c, if_exists="append", index=False)
breakpoint()
print("Exiting context manager.")
print(">>> Done.")
2023-06-28 07:09:13,887 INFO sqlalchemy.engine.Engine INSERT INTO table_name (id) VALUES (?)
2023-06-28 07:09:13,887 INFO sqlalchemy.engine.Engine [generated in 0.00066s] [(1,), (2,)]
2023-06-28 07:09:13,912 INFO sqlalchemy.engine.Engine COMMIT
> /home/gord/PycharmProjects/sqlalchemy_demo/so76566226.py(14)<module>()
-> print("Exiting context manager.")
(Pdb)
But, .to_sql()
only seems to immediately automatically commit if a transaction is not active when it is invoked, so if we begin the transaction ourselves then the .to_sql()
won't be committed until we commit the transaction ourselves
df = pd.DataFrame([(1,), (2,)], columns=["id"])
engine.echo = True
with engine.connect() as c, c.begin():
df.to_sql("table_name", c, if_exists="append", index=False)
breakpoint()
c.commit()
print(">>> Done.")
2023-06-28 07:15:57,699 INFO sqlalchemy.engine.Engine INSERT INTO table_name (id) VALUES (?)
2023-06-28 07:15:57,699 INFO sqlalchemy.engine.Engine [generated in 0.00052s] [(1,), (2,)]
> /home/gord/PycharmProjects/sqlalchemy_demo/so76566226.py(14)<module>()
-> c.commit()
(Pdb) next
2023-06-28 07:16:13,131 INFO sqlalchemy.engine.Engine COMMIT
> /home/gord/PycharmProjects/sqlalchemy_demo/so76566226.py(15)<module>()
-> print(">>> Done.")
(Pdb)