mongodb pentaho kettle pentaho-spoon

Pentaho PDI - Reading data from MongoDB


I have installed Pentaho Data Integration (ce-5.0.1.A-stable) on my machine and I am trying to retrieve information from MongoDB using PDI. I have created a transformation with a MongoDB Input step. When I try to configure my MongoDB connection details, I can't find any explicit connection type for MongoDB. Could someone please advise on how to configure a MongoDB data source in Pentaho?

I have gone through most of the Pentaho MongoDB docs, but none of the solutions worked.

Also, I tried the steps below, as described on the official Pentaho site, but I still couldn't find any connection type for MongoDB:

1- Move the following folder out of the data-integration folder structure: data-integration/plugins/pentaho-big-data-plugin

2- Move the following files out of the data-integration folder structure if they exist:
   data-integration/libext/JDBC/pentaho-hadoop-hive-jdbc-shim-1.3.0.jar
   data-integration/libext/JDBC/pentaho-hadoop-hive-jdbc-shim-1.3.1.jar
   data-integration/libext/JDBC/pentaho-hadoop-hive-jdbc-shim-1.3.2.jar

3- Unzip the file pentaho-big-data-plugin-shimtastic-1.3.3.1.zip from the data-integration/plugins folder.

4- Optionally, remove irrelevant folders under data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations.

5- Copy the file pentaho-hadoop-hive-jdbc-shim-1.3.3.jar into the folder data-integration/libext/JDBC

6- Unzip the file pentaho-instaview-templates-shimtastic-1.3.3.zip into the following directory: data-integration/plugins/spoon/agile-bi/platform/pentaho-solutions/system/instaview/templates/Big Data

Any help is really appreciated!


Solution

  • Pentaho does not have a specific database connection type for MongoDB, so you will not find it in the Database Connection dialog. The way to connect to MongoDB is to use the MongoDB Input step in PDI, which has its own connection details section (where you configure credentials). You can then connect a JSON Input step to read the results of the MongoDB Input step's output. See the screenshot below:

    [screenshot]

    You can also read about this on the Pentaho Wiki. The documentation is slightly old, but it describes the exact process.

    As a side note, you don't need the Big Data shims to connect to MongoDB. It seems you have configured the Hadoop Hive shims, which are not required here.

    Hope it helps :)
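
To make the MongoDB Input to JSON Input flow described above a bit more concrete, here is a rough Python sketch of what the two steps do conceptually. It is not PDI code and does not talk to MongoDB. It assumes the MongoDB Input step hands each document downstream as one JSON string, and that the JSON Input step pulls fields out using paths such as `$.address.city`. The sample documents, the field names, and the `extract` and `parse_rows` helpers are all made up for illustration, and the path handling is only a small subset of JSONPath.

```python
import json

# Hypothetical sample: one JSON string per MongoDB document, standing in
# for the rows a MongoDB Input step might emit as a single JSON field.
rows = [
    '{"_id": 1, "name": "Alice", "address": {"city": "Pune"}}',
    '{"_id": 2, "name": "Bob", "address": {"city": "Chennai"}}',
]

# Output field name -> simple dotted path (illustrative subset of JSONPath).
FIELD_PATHS = {
    "id": "$._id",
    "name": "$.name",
    "city": "$.address.city",
}


def extract(document, path):
    """Walk a '$.a.b' style path through nested dicts; None if missing."""
    current = document
    for key in path.lstrip("$").strip(".").split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def parse_rows(json_rows, field_paths):
    """Turn JSON strings into flat records, one dict per document."""
    records = []
    for raw in json_rows:
        document = json.loads(raw)
        records.append(
            {name: extract(document, path) for name, path in field_paths.items()}
        )
    return records


if __name__ == "__main__":
    for record in parse_rows(rows, FIELD_PATHS):
        print(record)
```

In Spoon, the equivalent of `FIELD_PATHS` would be the field list you define in the JSON Input step, with the incoming JSON field from the MongoDB Input step selected as the source.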