pentaho kettle pdi

Pentaho PDI Repository connection


  1. Can you explain the difference between the types of repositories in Pentaho PDI, and what is the use of having these different repositories?
  2. What is the benefit of the JNDI and OCI options in the database connection wizard, and how do I configure these two?

Thanks in advance for your input.


Solution

  • Question 1: You have 3 types of repositories: File repository, Database repository and Pentaho repository.

    You can export/import from one repository format to another at any time.

    The File repository saves the transformations/jobs/connections/etc. in XML files. The other two store them in a database, which means they can be shared between users. The Database (CE) repository contains only the latest version, while the Pentaho (EE) repository, for which you have to pay for a license, has version control and other fancy stuff.

    Which one to choose: for a single user, the simplest is the file repository, unless you want to query the repository with SQL. That can be useful when you are suddenly put in front of an undocumented ETL system in production and have to migrate, upgrade, optimize or debug it.
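To give an idea of what querying a database repository with SQL can look like, here is a minimal sketch. It uses an in-memory SQLite database as a stand-in for the real repository, and the table and column names (R_TRANSFORMATION, R_STEP, R_STEP_TYPE) are assumptions based on the usual Kettle repository layout, so check them against your own schema. The sample rows are invented for illustration.

```python
import sqlite3

# Stand-in for a real database repository: an in-memory SQLite database with
# a cut-down copy of three repository tables. Table and column names follow
# the usual Kettle layout but are assumptions; verify them in your own schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R_TRANSFORMATION (ID_TRANSFORMATION INTEGER, NAME TEXT, MODIFIED_USER TEXT);
    CREATE TABLE R_STEP_TYPE (ID_STEP_TYPE INTEGER, CODE TEXT);
    CREATE TABLE R_STEP (ID_STEP INTEGER, ID_TRANSFORMATION INTEGER, NAME TEXT, ID_STEP_TYPE INTEGER);
    INSERT INTO R_TRANSFORMATION VALUES (1, 'load_customers', 'alice'), (2, 'load_orders', 'bob');
    INSERT INTO R_STEP_TYPE VALUES (10, 'TableInput'), (20, 'TableOutput');
    INSERT INTO R_STEP VALUES (100, 1, 'Read CRM', 10), (101, 1, 'Write DWH', 20), (102, 2, 'Write orders', 20);
""")


def transformations_using(conn, step_type_code):
    """Return (transformation name, step name) for every step of the given type."""
    query = """
        SELECT t.NAME, s.NAME
        FROM R_STEP s
        JOIN R_TRANSFORMATION t ON t.ID_TRANSFORMATION = s.ID_TRANSFORMATION
        JOIN R_STEP_TYPE st ON st.ID_STEP_TYPE = s.ID_STEP_TYPE
        WHERE st.CODE = ?
        ORDER BY t.NAME, s.NAME
    """
    return conn.execute(query, (step_type_code,)).fetchall()


# Which transformations write to a table, and through which step?
print(transformations_using(conn, "TableOutput"))
```

Against a real repository you would run only the SELECT, through whatever SQL client you already use for that database.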

    For multiple developers, use a database repository if you expect them to use, but rarely modify, transformations/jobs written by others. Otherwise, if you feel you need version control for frequent reverts, use files shared through a version control system such as SVN or Git (for example on GitHub). In that case, the other developers need to pull committed changes to stay in sync.

    And of course, if your client can afford to sponsor open source by buying a license, take the Enterprise repository, which gives you both: real-time modifications and version control.

  • Question 2: If you have to ask, use a Native (JDBC) connection (or OCI, which is specific to Oracle and goes through an installed Oracle client), with the connection parameters defined in kettle.properties. JNDI is a technology by which several users share the same connection definition, which is looked up by name and appears as a centralized service. In the context of PDI there is very little difference, except when your DBA gives you the connection credentials in JNDI or JDBC format.
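As a rough illustration of the two setups, the fragment below shows what the configuration files could contain. The file locations are the ones a default PDI install normally uses, and every variable name, data source name, host and credential here is hypothetical.

```properties
# ~/.kettle/kettle.properties
# Variables for a Native (JDBC) connection; the names are hypothetical.
# In the connection dialog, reference them as ${DWH_HOST}, ${DWH_PORT}, etc.
DWH_HOST=dbhost.example.com
DWH_PORT=5432
DWH_DB=warehouse
DWH_USER=etl_user

# data-integration/simple-jndi/jdbc.properties
# A JNDI data source; the name "DWH" is hypothetical.
# In the connection dialog, choose JNDI access and enter DWH as the JNDI name.
DWH/type=javax.sql.DataSource
DWH/driver=org.postgresql.Driver
DWH/url=jdbc:postgresql://dbhost.example.com:5432/warehouse
DWH/user=etl_user
DWH/password=change_me
```

Either way the transformation itself holds no host name or password, so the same files can move between development and production with only these properties changing.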