Tags: hadoop, cassandra, cql3, word-count, datastax

Exception in running hadoop wordcount example in cassandra


I have a cluster with 2 nodes, and I have installed Hadoop and Cassandra correctly on both. When I run the word count example from [https://github.com/apache/cassandra/tree/trunk/examples/hadoop_cql3_word_count]

(first running WordCountSetup, then building a jar from WordCount and running it on Hadoop), I get this exception:

Exception in thread "main" java.lang.NoClassDefFoundError: com/datastax/driver/core/policies/LoadBalancingPolicy
    at WordCount.run(WordCount.java:236)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at WordCount.main(WordCount.java:78)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.lang.ClassNotFoundException: com.datastax.driver.core.policies.LoadBalancingPolicy
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 8 more

Solution

  • A MapReduce program has two main sections: the driver section and the map/reduce section. The driver section executes on the machine where you launch the job, whereas the map/reduce section executes on your slave nodes. You are seeing a NoClassDefFoundError, which means the class com.datastax.driver.core.policies.LoadBalancingPolicy is not on the classpath of either your driver code or your mapper/reducer code.

    If the exception comes from your driver code, set the HADOOP_CLASSPATH environment variable before running the hadoop command. You need to find the jar that contains the class com.datastax.driver.core.policies.LoadBalancingPolicy and put its full path in the placeholder:

    export HADOOP_CLASSPATH=<_JAR_NAME>     # full path of the jar containing the class

    The above option won't work if the exception comes from a mapper or reducer. In that case, build a fat jar by bundling the dependent jars into your jar (myJar.jar).
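If you build with Maven, one common way to produce such a fat jar is the Maven Shade plugin; a minimal pom.xml sketch (the version shown is only an example):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.3</version>
  <executions>
    <execution>
      <!-- bundle all runtime dependencies into the job jar at package time -->
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```

After `mvn package`, the shaded jar carries the DataStax driver classes to the task nodes along with your job classes.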

    Alternatively, use the hadoop -libjars option (this requires your main class to run the job through ToolRunner).
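The -libjars flag ships the listed jars to every task and adds them to the client classpath, but only when the driver parses generic options via ToolRunner. A hypothetical invocation (the jar names and paths are examples, not taken from the question):

```shell
# Hypothetical paths -- adjust to your job jar and driver jar.
JOB_JAR=wordcount.jar
DRIVER_JAR=/usr/share/cassandra/lib/cassandra-driver-core.jar

# -libjars must come before the job's own arguments so that
# ToolRunner's GenericOptionsParser can consume it.
CMD="hadoop jar $JOB_JAR WordCount -libjars $DRIVER_JAR"
echo "$CMD"   # prints the command to run once the paths match your cluster
```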