java.lang.NullPointerException using MRUnit. Custom key serialization error


I'm trying to test a simple MapReduce project using MRUnit. I set the input for the mapDriver and then call mapDriver.runTest(). I have also tried mapDriver.run(), which produces the same error.

I have written a custom key that implements the write(DataOutput out), readFields(DataInput in) and compareTo(...) methods. When debugging, I can see the key write its data correctly in write(DataOutput out), and readFields(DataInput in) correctly reads back the data that was written. However, as soon as readFields(DataInput in) returns, the error below is thrown.
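For reference, the write/readFields round trip described above can be sketched in isolation. This is only a minimal sketch using java.io, not my actual key: RatingKey, its userId and itemId fields, and KeyRoundTripDemo are hypothetical names, and in a real job the key would implement Hadoop's WritableComparable rather than plain Comparable. It does not go through MRUnit's Serialization.copy, so it cannot reproduce the NullPointerException; it only checks that the key's own methods are symmetric.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical key. In a real job it would declare
// "implements org.apache.hadoop.io.WritableComparable<RatingKey>".
final class RatingKey implements Comparable<RatingKey> {
    private int userId;
    private int itemId;

    // No-argument constructor so an empty instance can be created
    // before readFields() fills it in.
    RatingKey() { }

    RatingKey(int userId, int itemId) {
        this.userId = userId;
        this.itemId = itemId;
    }

    // Writes the fields in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeInt(userId);
        out.writeInt(itemId);
    }

    // Reads the fields back in the same order they were written.
    public void readFields(DataInput in) throws IOException {
        userId = in.readInt();
        itemId = in.readInt();
    }

    @Override
    public int compareTo(RatingKey other) {
        int byUser = Integer.compare(userId, other.userId);
        return byUser != 0 ? byUser : Integer.compare(itemId, other.itemId);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof RatingKey)) return false;
        RatingKey k = (RatingKey) o;
        return userId == k.userId && itemId == k.itemId;
    }

    @Override
    public int hashCode() {
        return 31 * userId + itemId;
    }

    @Override
    public String toString() {
        return userId + ":" + itemId;
    }
}

final class KeyRoundTripDemo {
    // Serializes the key to a byte array and reads it into a fresh instance.
    static RatingKey copy(RatingKey original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            original.write(new DataOutputStream(buffer));
            RatingKey restored = new RatingKey();
            restored.readFields(new DataInputStream(
                    new ByteArrayInputStream(buffer.toByteArray())));
            return restored;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        RatingKey key = new RatingKey(7, 42);
        System.out.println(copy(key).equals(key)); // prints "true"
    }
}
```

If a check like this passes, the key's own read and write logic is probably not the cause, and the problem is more likely in how the test driver is set up.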

I have searched here for similar posts and have tried overriding the hashCode() and equals() methods, to no avail. Does MRUnit require any additional methods to be overridden when using custom keys? The most similar post I found is "MRUnit with Avro NullPointerException in Serialization". However, I am not using Avro and, as far as I am aware, I am using the default serialization. Cheers!

java.lang.NullPointerException
at org.apache.hadoop.mrunit.Serialization.copy(Serialization.java:61)
at org.apache.hadoop.mrunit.Serialization.copy(Serialization.java:81)
at org.apache.hadoop.mrunit.mapreduce.mock.MockContextWrapper$4.answer(MockContextWrapper.java:78)
at org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:31)
at org.mockito.internal.MockHandler.handle(MockHandler.java:97)
at org.mockito.internal.creation.MethodInterceptorFilter.intercept(MethodInterceptorFilter.java:47)
at org.apache.hadoop.mapreduce.Mapper$Context$$EnhancerByMockitoWithCGLIB$$f555e120.write(<generated>)
at model.RMSEEvaluation$Mapper.map(RMSEEvaluation.java:57)
at model.RMSEEvaluation$Mapper.map(RMSEEvaluation.java:1)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mrunit.mapreduce.MapDriver.run(MapDriver.java:221)
at org.apache.hadoop.mrunit.MapDriverBase.runTest(MapDriverBase.java:150)
at org.apache.hadoop.mrunit.TestDriver.runTest(TestDriver.java:137)
at test.TestRMSEEvaluation.testSetValues(TestRMSEEvaluation.java:77)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:243)
at junit.framework.TestSuite.run(TestSuite.java:238)
at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)

Solution

  • I have found a solution to this error. The serialization classes had not been set in the Configuration used by the MapDriver (mapDriver), so I had to set them explicitly:

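    // Requires: import org.apache.hadoop.conf.Configuration;
    Configuration conf = new Configuration();
    // Register both Java and Writable serialization for the test driver.
    conf.set("io.serializations",
            "org.apache.hadoop.io.serializer.JavaSerialization,"
            + "org.apache.hadoop.io.serializer.WritableSerialization");
    mapDriver.setConfiguration(conf);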

    Hope this helps anyone with a similar problem!