Search code examples
pythonsqlduckdb

Replace white space from column name with underscore in DuckDB Python client API


I have a DuckDB table whose column names have white spaces, and I'd like to just specify a blanket rule that says "for all columns with spaces, replace it with an underscore". I know how to do this by converting the table to a Polars DataFrame, but can I do it without the conversion, preferably using DuckDB's relational API?

Sample data

test.csv

"id","company name"
1,"Walmart"
2,"Amazon"
3,"Apple"
df = duckdb.sql("SELECT * FROM test.csv")

This yields the company name column name with spaces.

┌───────┬──────────────┐
│  id   │ company name │
│ int64 │   varchar    │
├───────┼──────────────┤
│     1 │ Walmart      │
│     2 │ Amazon       │
│     3 │ Apple        │
└───────┴──────────────┘

Polars works

df = duckdb.sql("SELECT * FROM 'test.csv'").pl()

# Replace spaces with underscores in column names using Polars
df.columns = list(map(lambda x: x.replace(" ", "_"), df.columns))
df_new = duckdb.sql("SELECT * from df")
┌───────┬──────────────┐
│  id   │ company_name │
│ int64 │   varchar    │
├───────┼──────────────┤
│     1 │ Walmart      │
│     2 │ Amazon       │
│     3 │ Apple        │
└───────┴──────────────┘

What I want to do

I'm unable to find any examples on how to do this directly, without conversion to polars in DuckDB's Python client API. The docs show how to use ALTER TABLE via raw SQL), but my goal is to do this via Python.


Solution

  • If you are ok with normalizing all non-alphanumeric characters in your column names, you can use the normalize_names parameter of the *_csv_auto functions

    The python function from_csv_auto

    df = duckdb.from_csv_auto('test.csv', normalize_names=True)
    

    The sql-like read_csv_auto

    df = duckdb.sql("select * from read_csv_auto('test.csv', normalize_names=True)")
    

    This will replace the spaces with underscores.

    ┌───────┬──────────────┐
    │  id   │ company_name │
    │ int64 │   varchar    │
    ├───────┼──────────────┤
    │     1 │ Walmart      │
    │     2 │ Amazon       │
    │     3 │ Apple        │
    └───────┴──────────────┘