I have no problems with Eclipse's remote debugging when running Hadoop in standalone mode. However, it does not work when I run Hadoop in pseudo-distributed mode. Here is how I attempt Eclipse remote debugging with Hadoop in pseudo-distributed mode:
I add a line to my hadoop script like so:
#added this line to enable remote debugging
HADOOP_OPTS="$HADOOP_OPTS -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5000"
# run it
exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS -classpath "$CLASSPATH" $CLASS "$@"
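Since the option above is added unconditionally, every command launched through that script would wait for a debugger on port 5000. A minimal sketch of an opt-in variant follows; `HADOOP_DEBUG_PORT` and `add_debug_opts` are hypothetical names made up for illustration, not Hadoop settings:

```shell
# Hypothetical helper: append the JDWP agent only when HADOOP_DEBUG_PORT is set.
# HADOOP_DEBUG_PORT is an illustrative variable name, not something Hadoop reads.
add_debug_opts() {
  if [ -n "${HADOOP_DEBUG_PORT:-}" ]; then
    HADOOP_OPTS="${HADOOP_OPTS:-} -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=${HADOOP_DEBUG_PORT}"
  fi
}

# Call this just before the exec line in the script.
add_debug_opts
```

With this in place, running the job as `HADOOP_DEBUG_PORT=5000 hadoop jar ...` would enable the agent for that one invocation only.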
Then I create a remote debugging configuration in Eclipse that connects to port 5000.
I run the job from the command line, and it prints the expected message:
Listening for transport dt_socket at address: 5000
I then go back to Eclipse and run the debug configuration. It stops in my main() function as it should.
However, it doesn't hit any of the breakpoints I set in my mapper or reducer.
What's the problem here? Why does it work with Hadoop in standalone mode but not in pseudo-distributed mode? Is it possible to do remote debugging with Hadoop in pseudo-distributed mode? If not, what's the "right" way to debug my MapReduce code in Eclipse?
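See Lorand's comment above. Remote debugging configured this way will only work in standalone mode. In pseudo-distributed mode, the hadoop script starts only the client JVM that submits the job. The map and reduce tasks run in separate child JVMs launched by the TaskTracker, and those never see the -agentlib option added to HADOOP_OPTS. In standalone mode the whole job runs inside the one JVM the debugger is attached to, so the mapper and reducer breakpoints are hit.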
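To get standalone behaviour for a single run without editing the cluster configuration, one option is to override the job tracker and file system on the command line. The sketch below only prints such a command. It assumes Hadoop 1.x property names (`mapred.job.tracker`, `fs.default.name`) and a driver that goes through `ToolRunner`, so that `-D` generic options are parsed. The helper name `local_debug_cmd`, the jar and the class name are placeholders.

```shell
# Sketch: print a command line that runs a job in-process (local job runner),
# so a debugger attached to the client JVM also reaches mapper and reducer code.
# Assumptions: Hadoop 1.x property names, and a driver that uses ToolRunner so
# that -D generic options are honoured. Jar and class names are placeholders.
local_debug_cmd() {
  jar="$1"; main_class="$2"; shift 2
  echo "hadoop jar $jar $main_class -D mapred.job.tracker=local -D fs.default.name=file:/// $*"
}

local_debug_cmd myjob.jar com.example.MyDriver input output
```

On Hadoop 2.x and later the equivalent properties are `mapreduce.framework.name=local` and `fs.defaultFS`. Input and output paths then refer to the local file system rather than HDFS.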