I'm trying to use pandas in a dbt Python model (dbt-duckdb), but I keep getting this error:

    Python model failed: No module named 'pandas'

Here is my dbt model:

    import boto3
    import pandas as pd


    def model(dbt, session):
        dbt.config(
            materialized="table",
            packages=["pandas==2.2.3"],
            python_version="3.11",
        )
        key = "my_key"
        bucket = "my_bucket"
        client = boto3.client("s3")
        return None

I know DuckDB can read files from S3 directly, but I need to manipulate the files before DuckDB reads them because they are not correct.
This is my models YAML config:

    version: 2

    models:
      - name: test
        config:
          packages:
            - "pandas==2.2.3"

I also have a virtual environment with pandas installed.
If anyone has experience with this, thanks in advance!
Found it: make sure the venv you are using is called dbt-env.
dbt will then automatically pick up this venv, where you have installed pandas or whatever other package you need!
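
If you want to check which environment is actually being used, here is a small sketch that might help. It assumes (as far as I understand dbt-duckdb) that the Python model runs in the same interpreter as dbt itself, so pandas has to be importable from that interpreter. The helper name describe_environment is my own and not part of dbt or dbt-duckdb; it only uses the standard library.

```python
import importlib.util
import sys


def describe_environment(package="pandas"):
    """Report which interpreter is running and whether `package` is importable.

    Hypothetical helper for debugging, not a dbt or dbt-duckdb API.
    """
    # find_spec returns None when a top-level package cannot be found
    spec = importlib.util.find_spec(package)
    return {
        "interpreter": sys.executable,
        "prefix": sys.prefix,
        # inside a venv, sys.prefix differs from sys.base_prefix
        "in_virtualenv": sys.prefix != sys.base_prefix,
        "package": package,
        "importable": spec is not None,
        "location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Run it with the same python that sits next to your dbt executable (or temporarily print the result from inside model()). If "importable" comes back False, pandas is installed in a different environment than the one dbt is running from.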