
Pentaho Data Integration: How to run a job with Kitchen on a Carte cluster?


I set up a Carte cluster (1 master and 2 slaves) and ran a job on the cluster from Spoon. But when I ran it with the kitchen command or through Carte HTTP access, it ran as standalone (only on the master node).

Did I miss anything in the configuration? Or doesn't it support cluster mode?

Here is what I tried:

  1. My config:

     (two configuration screenshots)

  2. Ran in Spoon with "Environment Type -- Local"

    master output:

    2017/11/28 04:47:09 - RepositoriesMeta - Reading repositories XML file: /root/.kettle/repositories.xml
    Tue Nov 28 04:47:09 EST 2017 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must   be       established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to   explicitly disable SSL       by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
    2017/11/28 04:47:10 - sortcluster111 (master) - Dispatching started for transformation [sortcluster111 (master)]
    Tue Nov 28 04:47:10 EST 2017 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must   be       established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to   explicitly disable SSL       by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
    Tue Nov 28 04:47:10 EST 2017 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must   be       established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to   explicitly disable SSL       by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
    2017/11/28 04:47:10 - output.0 - Connected to database [102] (commit=1000)
    2017/11/28 04:47:10 - input.0 - Finished reading query, closing connection.
    2017/11/28 04:47:10 - input.0 - Finished processing (I=47, O=0, R=0, W=47, U=0, E=0)
    2017/11/28 04:47:10 - input.0 - Server socket accepted for port [40001], reading from server Dynamic slave [kettleslave02:8083]
    2017/11/28 04:47:10 - input.0 - Server socket accepted for port [40000], reading from server Dynamic slave [kettleslave01:8082]
    2017/11/28 04:47:10 - output.0 - Finished processing (I=47, O=47, R=0, W=47, U=0, E=0)      
    

    slave01 output:

    2017/11/28 04:47:09 - RepositoriesMeta - Reading repositories XML file: /root/.kettle/repositories.xml
    Tue Nov 28 04:47:09 EST 2017 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must  be        established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to  explicitly disable SSL        by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
    2017/11/28 04:47:10 - sortcluster111 (cluster1:Dynamic slave [kettleslave01:8082]) - Dispatching started for transformation [sortcluster111 (cluster1:Dynamic slave [kettleslave01:8082])]
    2017/11/28 04:47:10 - sort.0 - Server socket accepted for port [40000], reading from server kettlemaster01
    2017/11/28 04:47:10 - sort.0 - Finished processing (I=24, O=0, R=0, W=24, U=0, E=0)
    

    slave02 output:

    2017/11/28 04:47:09 - RepositoriesMeta - Reading repositories XML file: /root/.kettle/repositories.xml
    2017/11/28 04:47:09 - General - Unable to connect to the repository with name 'Mysqlrep'
    2017/11/28 04:47:10 - sortcluster111 (cluster1:Dynamic slave [kettleslave02:8083]) - Dispatching started for transformation [sortcluster111 (cluster1:Dynamic slave [kettleslave02:8083])]
    2017/11/28 04:47:10 - sort.0 - Server socket accepted for port [40000], reading from server kettlemaster01
    2017/11/28 04:47:10 - sort.0 - Finished processing (I=23, O=0, R=0, W=23, U=0, E=0)        
    
  3. Ran with Kitchen:

    kitchen.sh -rep=Mysqlrep -user=admin -pass=admin -job trans1
    

    master output:

    2017/11/28 04:10:19 - trans1 - Starting entry [sorttrans]
    2017/11/28 04:10:19 - sorttrans - Loading transformation from repository [sortcluster111] in directory [/]
    2017/11/28 04:10:19 - sorttrans - Using run configuration [cluster config]
    2017/11/28 04:10:19 - sorttrans - Using legacy execution engine
    2017/11/28 04:10:19 - sortcluster111 - Dispatching started for transformation [sortcluster111]
    Tue Nov 28 04:10:19 EST 2017 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be         established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL         by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
    Tue Nov 28 04:10:19 EST 2017 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be         established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL         by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
    2017/11/28 04:10:19 - output.0 - Connected to database [102] (commit=1000)
    2017/11/28 04:10:19 - input.0 - Finished reading query, closing connection.
    2017/11/28 04:10:19 - input.0 - Finished processing (I=47, O=0, R=0, W=47, U=0, E=0)
    2017/11/28 04:10:19 - sort.0 - Finished processing (I=0, O=0, R=47, W=47, U=0, E=0)
    2017/11/28 04:10:19 - output.0 - Finished processing (I=0, O=47, R=47, W=47, U=0, E=0)
    2017/11/28 04:10:19 - trans1 - Starting entry [finish]
    2017/11/28 04:10:19 - trans1 - Finished job entry [finish] (result=[true])
    2017/11/28 04:10:19 - trans1 - Finished job entry [sorttrans] (result=[true])
    2017/11/28 04:10:19 - trans1 - Finished job entry [SQL] (result=[true])
    2017/11/28 04:10:19 - trans1 - Job execution finished
    2017/11/28 04:10:19 - Kitchen - Finished!
    2017/11/28 04:10:19 - Kitchen - Start=2017/11/28 04:10:00.586, Stop=2017/11/28 04:10:19.739
    2017/11/28 04:10:19 - Kitchen - Processing ended after 19 seconds.
    

    There was no output on the slaves.

Regards

John


Solution

  • There is a bug in newer versions of PDI: the option "Run this transformation in a clustered mode?" no longer exists. To work around it, open the job XML file and, in each transformation entry that you want to run in clustered mode, remove the run_configuration property and set cluster to Y. Hope this helps.
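
The manual edit described above could also be scripted. The following is only a rough sketch in Python, under several assumptions that are not confirmed by the question: that the job is available as an XML file (a job stored in a repository such as Mysqlrep would first have to be exported), that each job entry is an `<entry>` element, that transformation entries carry `<type>TRANS</type>`, and that `run_configuration` and `cluster` are child elements of the entry. The sample XML and the function name `enable_clustered_run` are made up for illustration; compare them against your own job file before relying on this.

```python
import xml.etree.ElementTree as ET


def enable_clustered_run(job_xml, entry_names=None):
    """Return job XML with transformation entries switched to clustered mode.

    entry_names: optional collection of entry names to change;
    None means every transformation entry is changed.
    """
    root = ET.fromstring(job_xml)
    for entry in root.iter("entry"):
        # Assumption: transformation entries are marked with <type>TRANS</type>.
        if entry.findtext("type") != "TRANS":
            continue
        if entry_names is not None and entry.findtext("name") not in entry_names:
            continue
        # Remove the run_configuration property, as the answer suggests.
        for run_config in entry.findall("run_configuration"):
            entry.remove(run_config)
        # Set cluster to Y, creating the element if it is missing.
        cluster = entry.find("cluster")
        if cluster is None:
            cluster = ET.SubElement(entry, "cluster")
        cluster.text = "Y"
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily trimmed job file used only to exercise the function.
SAMPLE = """<job>
  <name>trans1</name>
  <entries>
    <entry>
      <name>sorttrans</name>
      <type>TRANS</type>
      <run_configuration>cluster config</run_configuration>
      <cluster>N</cluster>
    </entry>
    <entry>
      <name>finish</name>
      <type>SUCCESS</type>
    </entry>
  </entries>
</job>"""

if __name__ == "__main__":
    print(enable_clustered_run(SAMPLE))
```

Keep a backup of the original job file before overwriting it, and check afterwards that the slaves log activity (for example the "Server socket accepted" lines seen in the Spoon run) when the job is started with kitchen.sh.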