
How to deploy scheduled Kettle jobs on Pentaho BI server v6 CE


I have a server running Pentaho BI server v6 Community Edition. We've developed a Kettle job that extracts data from one database and loads it into another, exported as a KJB file. I would like to run this job every 12 or so hours.

I noticed that the BI server already includes Kettle and has the ability to upload and schedule jobs. Do I need to install the DI server if the BI server already has Kettle installed?

If not, how can I publish the KJB file to the BI server? I'd like to use a file system repository. If I upload the file directly through the user console, the log shows that the import succeeded, but I cannot select or run the job anywhere.


Solution

  • I use Pentaho BI server 5, but it should work the same way on Pentaho BI 6.

    My Kettle job runs several sub-transformations. The transformation files are stored in a directory on the server's file system, e.g. /opt/etl.

    So let's say I have one job (daily_job.kjb) with two sub-transformations.

    To run a Kettle job on Pentaho BI CE, I use these steps:

    1. set the location of each sub-transformation correctly in the job file
    2. upload the sub-transformations to the matching directory on the server (/opt/etl)
    3. create an xaction file that executes the Kettle job on the BI server (daily.xaction)
    4. upload the daily.xaction and daily_job.kjb files to the Pentaho BI server (into the same folder)
    5. schedule the daily.xaction file on the Pentaho BI server

    Job settings in daily_job.kjb:

    [Screenshot: job settings in daily_job.kjb (image not available)]
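
    For readers without the screenshot, the setting from step 1 can be sketched as it would appear in the XML of the KJB file. This is only a fragment under stated assumptions: the entry name and the file name first_transformation.ktr are hypothetical, the other child elements of the entry are omitted, and the /opt/etl path is taken from the directory mentioned above.

    ```xml
    <!-- Fragment of daily_job.kjb: one transformation job entry (other child elements omitted) -->
    <entry>
      <name>first_transformation</name>
      <type>TRANS</type>
      <!-- Reference the transformation by file name rather than by repository -->
      <specification_method>filename</specification_method>
      <!-- Absolute path on the server's file system; the file name is hypothetical -->
      <filename>/opt/etl/first_transformation.ktr</filename>
    </entry>
    ```

    The point of the sketch is that the path stored in the job must be valid on the server where the job runs, not on the machine where it was designed.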

    Xaction code for daily.xaction (it simply executes daily_job.kjb, which is located in the same BI server folder as the xaction file):

    <?xml version="1.0" encoding="UTF-8"?>
    <action-sequence> 
      <title>My scheduled job</title>
      <version>1</version>
      <logging-level>ERROR</logging-level>
      <documentation> 
        <author>mzy</author>  
        <description>Sequence for running daily job.</description>  
        <help/>  
        <result-type/>  
        <icon/> 
      </documentation>
    
      <inputs> 
      </inputs>
    
      <outputs> 
        <logResult type="string">
          <destinations>
            <response>content</response>
          </destinations>
        </logResult>
      </outputs>
    
      <resources>
        <!-- The job file is resolved relative to the folder that contains this xaction -->
        <job-file>
          <solution-file> 
            <location>daily_job.kjb</location>  
            <mime-type>text/xml</mime-type> 
          </solution-file>     
        </job-file>
      </resources>
    
      <actions> 
        <action-definition>
          <component-name>KettleComponent</component-name>
          <action-type>Pentaho Data Integration Job</action-type>
          <action-inputs>   
          </action-inputs>
          <action-resources>
            <job-file type="resource"/>
          </action-resources>
          <action-outputs> 
            <kettle-execution-log type="string" mapping="logResult"/>  
            <kettle-execution-status type="string" mapping="statusResult"/> 
          </action-outputs>   
          <component-definition>
            <kettle-logging-level><![CDATA[info]]></kettle-logging-level>           
          </component-definition>
        </action-definition>
    
      </actions> 
    </action-sequence>
    

    Scheduling the Kettle job (xaction file) on Pentaho BI CE:

    [Screenshot: scheduling daily.xaction in the Pentaho BI server (image not available)]
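
    The schedule can probably also be created without the console by POSTing a request to the server's scheduler REST endpoint (/pentaho/api/scheduler/job). The request body below is a sketch only, not something I have run: it assumes the xaction was uploaded to a hypothetical /public/etl folder and uses the 12-hour interval from the question. The element names are recalled from the scheduler API's job schedule request, so treat each of them as an assumption and check them against the documentation for your server version.

    ```xml
    <!-- Sketch of a scheduler request body; job name, folder and start time are hypothetical -->
    <jobScheduleRequest>
      <jobName>daily_job_every_12h</jobName>
      <simpleJobTrigger>
        <uiPassParam>HOURS</uiPassParam>
        <!-- Repeat interval in seconds: 12 h * 3600 s = 43200 s -->
        <repeatInterval>43200</repeatInterval>
        <!-- -1 is assumed to mean "repeat indefinitely" -->
        <repeatCount>-1</repeatCount>
        <startTime>2016-01-01T00:00:00.000+00:00</startTime>
        <endTime/>
      </simpleJobTrigger>
      <!-- Repository path of the uploaded xaction file -->
      <inputFile>/public/etl/daily.xaction</inputFile>
      <!-- Repository folder for any generated output -->
      <outputFile>/public/etl</outputFile>
    </jobScheduleRequest>
    ```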