MongoDB connection with Pentaho Kettle (PDI)


I've just downloaded Pentaho Data Integration Community Edition (pdi-ce-6.1.0.1-196), a.k.a. Kettle, with the goal of designing an ETL routine that runs nightly migrations from a MongoDB schema into PostgreSQL.

I couldn't get past the very first task: creating a MongoDB connection. MongoDB is not listed as a Connection Type in the New Connection dialog, so I chose Generic database. I then couldn't find anything related to MongoDB to put in the Custom Driver Class Name field, which the generic connection requires.

Is it possible that the installation/configuration of Kettle went wrong? I remember that I had to kill the first startup because it hung forever.

Or does PDI-CE lack some component that I must get somewhere else?


Solution

  • PDI handles MongoDB differently than other databases.

    If you are working on a transformation (as opposed to a job), open the "Big Data" group of steps. It contains two MongoDB steps: MongoDB Input and MongoDB Output.

    You specify the connection information for your database inside those steps, rather than in the New Connection dialog.

    Hope that helps,

    Mark

    P.S. There is also a "MongoDB Delete" plugin in the Marketplace that comes in handy when deleting data from collections.
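
To make "connection information" a bit more concrete: the details you would enter in those steps (host, port, database, and optional credentials) are the same pieces that make up a standard MongoDB connection URI. The sketch below is not PDI code, and `build_mongo_uri` is a hypothetical helper written only for illustration. It assumes a single host and MongoDB's default port of 27017, and shows how those pieces fit together so you can check your values with another client before pasting them into the step dialog.

```python
from urllib.parse import quote_plus


def build_mongo_uri(host, port=27017, database="", username=None, password=None):
    """Assemble a standard mongodb:// URI from individual connection settings.

    Hypothetical helper for illustration only; PDI itself takes these
    values as separate fields in the step dialog.
    """
    credentials = ""
    if username is not None:
        # Percent-encode credentials so characters like '@' or ':' stay unambiguous.
        credentials = quote_plus(username)
        if password is not None:
            credentials += ":" + quote_plus(password)
        credentials += "@"
    return f"mongodb://{credentials}{host}:{port}/{database}"


if __name__ == "__main__":
    # Example values are placeholders, not taken from the question.
    print(build_mongo_uri("localhost", database="sales"))
```

If a URI built this way works from another MongoDB client, the same host, port, database, and credentials should be the values to enter in the MongoDB Input or MongoDB Output step.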