python cygwin hadoop mapreduce

cygwin hadoop map-reduce problem


I am having trouble getting the map/reduce example from this tutorial to work under Cygwin: http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/

Under Cygwin, passing -mapper=mapper.py results in "CreateProcess error=193, %1 is not a valid Win32 application".

When I try -mapper="python mapper.py" instead, it gives the error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1

Has anyone had success running Hadoop map/reduce using Python under Cygwin?

Thanks.


Solution

  • I have had success with that tutorial under Cygwin. I am using hadoop-0.20.2 under Cygwin 1.7.9-1 on WinXP. I haven't seen your exact message ... I'm answering, though, because I did have some trouble with the -mapper option and solved it by putting the Python scripts in the /tmp directory. I saw some error messages that made me think there was confusion about how the /home directory is named under Cygwin, so I avoided the issue by using /tmp, and that worked. I used single quotes, too, BTW. Pasting double quotes in Windows sometimes gives you a typographic quote character that a Unix process doesn't understand.

    BTW, I also used this tutorial on getting Hadoop going under Cygwin and Eclipse: http://ebiquity.umbc.edu/Tutorials/Hadoop/ The Eclipse/Java material near the end did not work for me, and it isn't how I was planning to work with Hadoop anyway. The first few steps, though, were helpful in getting a pseudo-cluster going.
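For illustration, here is a rough sketch of what a streaming invocation following the advice above might look like: scripts kept in /tmp, and the -mapper/-reducer values wrapped in single quotes. The answer does not give its exact command, so everything here is an assumption. That includes the helper name build_streaming_command, the jar path, the input and output names, and the choice to ship the scripts with -file and call them as "python mapper.py". The helper only builds and prints the command line; it does not run Hadoop, and it is not confirmed to resolve the errors in the question.

```python
import shlex


def build_streaming_command(hadoop_bin, streaming_jar, input_path, output_path,
                            script_dir="/tmp"):
    """Build the argv list for a Hadoop Streaming job (hypothetical helper).

    Nothing is executed here. The scripts are assumed to live in script_dir
    (the answer used /tmp) and to be named mapper.py and reducer.py.
    """
    mapper = script_dir + "/mapper.py"
    reducer = script_dir + "/reducer.py"
    return [
        hadoop_bin, "jar", streaming_jar,
        "-file", mapper,                 # ship the script with the job
        "-mapper", "python mapper.py",   # run it through the interpreter
        "-file", reducer,
        "-reducer", "python reducer.py",
        "-input", input_path,
        "-output", output_path,
    ]


def as_shell_line(argv):
    """Render argv as one shell line; arguments with spaces get single quotes."""
    return shlex.join(argv)


if __name__ == "__main__":
    # All paths below are assumptions; adjust them to the local install.
    argv = build_streaming_command(
        "bin/hadoop",
        "contrib/streaming/hadoop-0.20.2-streaming.jar",
        "gutenberg",
        "gutenberg-output",
    )
    print(as_shell_line(argv))
```

Typing the printed line by hand, rather than pasting it from a Windows application, avoids the typographic-quote problem mentioned above.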