
Apache Phoenix IndexTool failing with java.lang.ClassNotFoundException: org.apache.tephra.TransactionSystemClient

I have a Cloudera CDH 5.14.2 cluster with the Apache Phoenix parcel installed (APACHE_PHOENIX-4.14.0-cdh5.14.2.p0.3).

I have a table with a secondary index, and I would like to populate this index using the IndexTool provided with Apache Phoenix. However, the MapReduce job fails with the following error:

19/01/02 13:58:10 INFO mapreduce.Job: The url to track the job: http://mor-master-01.triviadata.local:8088/proxy/application_1546422102410_0020/
19/01/02 13:58:10 INFO mapreduce.Job: Running job: job_1546422102410_0020
19/01/02 13:58:18 INFO mapreduce.Job: Job job_1546422102410_0020 running in uber mode : false
19/01/02 13:58:18 INFO mapreduce.Job:  map 0% reduce 0%
19/01/02 13:58:22 INFO mapreduce.Job: Task Id : attempt_1546422102410_0020_m_000000_0, Status : FAILED
Error: java.lang.ClassNotFoundException: org.apache.tephra.TransactionSystemClient
        at java.lang.ClassLoader.loadClass(
        at sun.misc.Launcher$AppClassLoader.loadClass(
        at java.lang.ClassLoader.loadClass(
        at org.apache.phoenix.transaction.TransactionFactory$Provider.<clinit>(
        at org.apache.phoenix.query.QueryServicesOptions.<clinit>(
        at org.apache.phoenix.query.QueryServicesImpl.<init>(
        at org.apache.phoenix.jdbc.PhoenixDriver.getQueryServices(
        at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(
        at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(
        at org.apache.phoenix.jdbc.PhoenixDriver.connect(
        at java.sql.DriverManager.getConnection(
        at java.sql.DriverManager.getConnection(
        at org.apache.phoenix.mapreduce.util.ConnectionUtil.getConnection(
        at org.apache.phoenix.mapreduce.util.ConnectionUtil.getInputConnection(
        at org.apache.phoenix.mapreduce.PhoenixInputFormat.getQueryPlan(
        at org.apache.phoenix.mapreduce.PhoenixInputFormat.createRecordReader(
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(
        at org.apache.hadoop.mapred.MapTask.runNewMapper(
        at org.apache.hadoop.mapred.YarnChild$
        at Method)
        at org.apache.hadoop.mapred.YarnChild.main(

When I check my HBASE_CLASSPATH with the command ${HBASE_HOME}/bin/hbase classpath, I see it contains the following JARs:


When I check the source code and its dependencies, I see that the missing class is part of $PHOENIX_HOME/lib/phoenix/lib/tephra-core-0.14.0-incubating.jar.

When I grep the contents of this JAR for the missing class, I see that it is there:

# jar tf $PHOENIX_HOME/lib/phoenix/lib/tephra-core-0.14.0-incubating.jar | grep TransactionSystemClient
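The same check can be run over a whole directory when it is unclear which JAR ships a class. The following is a minimal sketch, not part of Phoenix or Hadoop: the function name `find_class_in_jars` and the `JAR_LIST_CMD` override are hypothetical names introduced here, and the listing command is assumed to print one archive entry per line (as `jar tf` does).

```shell
# Print every JAR under a directory whose listing contains the given class.
# JAR_LIST_CMD is a hypothetical override hook (not a Phoenix or Hadoop
# setting); it defaults to the JDK's "jar tf".
find_class_in_jars() {
    dir="$1"
    # Convert org.apache.Foo to org/apache/Foo.class, the form used inside JARs.
    entry="$(printf '%s' "$2" | tr '.' '/').class"
    find "$dir" -type f -name '*.jar' | while IFS= read -r jar_file; do
        # -x matches the whole line, -F treats the entry as a fixed string.
        if ${JAR_LIST_CMD:-jar tf} "$jar_file" 2>/dev/null | grep -qxF "$entry"; then
            printf '%s\n' "$jar_file"
        fi
    done
}

# Usage:
#   find_class_in_jars "$PHOENIX_HOME/lib/phoenix/lib" org.apache.tephra.TransactionSystemClient
```

Note that finding the class in a local JAR only shows that it exists on this machine; it says nothing about whether that JAR is on the classpath of the map tasks.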

Do you have any idea why the MR job cannot find this particular class?

If it helps, my table with secondary index is defined as follows:

0: jdbc:phoenix:localhost:2181/hbase> create table t1(v1 varchar, v2 varchar, v3 integer constraint primary_key primary key(v1)) immutable_rows=true, compression='SNAPPY';
1: jdbc:phoenix:localhost:2181/hbase> create index glb_idx on t1(v2) async;

And I run IndexTool with the command

${HBASE_HOME}/bin/hbase org.apache.phoenix.mapreduce.index.IndexTool -dt T1 -it GLB_IDX -op /tmp

When I create the index synchronously and upsert some data into the table, the index is populated correctly, so the Phoenix secondary index configuration looks OK.


  • This is caused by JARs missing from the classpath of the MapReduce job started by this command. By trial and error, I put together the list of dependencies required to run IndexTool on CDH 5.14.2.

    cat classpath.txt

    Then, using Cloudera Manager, I added all these JARs to the mapreduce.application.classpath property in mapred-site.xml on each worker node.

    Some of these dependencies are also required to submit the MR job, so I set HADOOP_CLASSPATH on the edge node from which I run the job to contain all these JARs.

    export HADOOP_CLASSPATH=$(paste -s -d: classpath.txt)

    Then I could run the job with the following command:

    hadoop jar /opt/cloudera/parcels/APACHE_PHOENIX-4.14.0-cdh5.14.2.p0.3/lib/phoenix/lib/phoenix-core-4.14.0-cdh5.14.2.jar org.apache.phoenix.mapreduce.index.IndexTool -s SCHEMA_NAME -dt TEST_TABLE -it INDEX_TABLE -op /tmp
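The two classpath values in this answer use different separators: HADOOP_CLASSPATH is colon-separated, while mapreduce.application.classpath in stock Hadoop is a comma-separated list. A minimal sketch of building both from the same file follows. The helper `join_jars` is a hypothetical name, and it assumes classpath.txt lists one concrete JAR path per line (no wildcards such as `lib/*`). It also skips paths that do not exist on the local machine, so that a typo in the list is reported rather than silently dropped into the classpath.

```shell
# Join the existing paths listed in a file (one per line) with a separator.
# Blank lines are skipped; paths missing on this machine are reported on stderr.
join_jars() {
    list_file="$1"
    sep="$2"
    joined=""
    # The "|| [ -n ... ]" part also handles a last line without a trailing newline.
    while IFS= read -r jar_path || [ -n "$jar_path" ]; do
        if [ -z "$jar_path" ]; then
            continue
        fi
        if [ ! -e "$jar_path" ]; then
            echo "warning: not found: $jar_path" >&2
            continue
        fi
        joined="${joined:+$joined$sep}$jar_path"
    done < "$list_file"
    printf '%s\n' "$joined"
}

# Usage, assuming classpath.txt is in the current directory:
#   export HADOOP_CLASSPATH="$(join_jars classpath.txt :)"   # for the submitting client
#   join_jars classpath.txt ,                                # value for mapreduce.application.classpath
```

The existence check only covers the machine where the function runs; the worker nodes need the same files at the same paths.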