I am trying to write a Polars DataFrame to a duckdb database. I have the following simple code which I expected to work:
import polars as pl
import duckdb
pldf = pl.DataFrame({'mynum': [1,2,3,4]})
with duckdb.connect(database="scratch.db", read_only=False) as con:
pldf.write_database(table_name='test_table', connection=con)
However, I get the following error:
sqlalchemy.exc.ArgumentError: Expected string or URL object, got <duckdb.duckdb.DuckDBPyConnection object
I get a similar error if I use the non-default engine='adbc'
instead of df.write_database()
's default engine='sqlalchemy'
.
So it seemed it should be easy enough to just swap in a URI for my ducdkb database, but I haven't been able to get that to work either. Potentially it's complicated by my being on Windows?
In-memory database. If you just want to use DuckDB to query a polars dataframe, this can simply be achieved as long as the table exists in the current scope.
duckdb.sql("SELECT * FROM df").show()
Persistent database If you want to use a persistent database, you could install duckdb-engine
and write the database using the connection URI string.
df.write_database(
table_name='test_table',
connection="duckdb:///scratch.db",
)
Reading the data back in using DuckDB works as usual.
with duckdb.connect(database="scratch.db", read_only=False) as con:
con.query("SELECT * FROM test_table").show()
┌───────┐
│ mynum │
│ int64 │
├───────┤
│ 1 │
│ 2 │
│ 3 │
│ 4 │
└───────┘