Search code examples
pythonpostgresqlpandasblazeodo

Inserting to schema-specific table with python's odo


I'm using python's odo to move data from a pandas dataframe to a postgresql database. The goal is that each "user" sees their own data in their schema, but with an identical data model and table/view naming schema between "users". With normal SQL I can do this:

CREATE SCHEMA my_schema;
CREATE TABLE my_schema.my_table AS select 1;

My DB URI looks like this

db_uri = 'postgresql://localhost/postgres::my_schema.my_table'

This gives me tables in the default schema named "my_schema.my_table", including the '.' in the table name, instead of tables named "my_table" in the schema "my_schema".

I've tried different combinations based on this github issue, such as:

db_uri = 'postgresql://localhost/postgres.schema::tmp')

which gives me this Traceback

sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) FATAL:  database "postgres/schema" does not exist

And also this one

db_uri = 'postgresql://localhost/postgres::my_schema/my_table'

which gives me tables named "my_schema/my_table".

Here's a sample code:

import pandas as pd
from odo import odo
db_uri = 'postgresql://localhost/postgres::my_schema.my_table'
odo(pd.DataFrame([{'a': 1}, {'a': 1}]), db_uri)

Solution

  • Hidden deep in a mailing list for blaze is a mention of the schema parameter

    d = Data(resource('postgresql://localhost/db::t', schema='myschema'))
    

    which can be used with odo with the following format:

    from odo import odo, drop
    drop(db_uri, schema='my_schema') # to drop table in specific schema
    odo(data, db_uri, schema='my_schema')
    

    working code

    import pandas as pd
    from odo import odo
    db_uri = 'postgresql://localhost/postgres::my_table'
    odo(pd.DataFrame([{'a': 1}, {'a': 1}]), db_uri, schema='my_schema')