I was doing an assignment for class. Running the mapper and reducer scripts on my local system went fine and I got the desired output, so I have a feeling something is wrong with Hadoop.
Here are the screenshots from my Linux terminal (just the top and bottom of the output).
My friend told me to check the logs in the Hadoop folder. The first one had this:
    Sep 12, 2023 7:58:26 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
    INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver as a provider class
    Sep 12, 2023 7:58:26 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
    INFO: Registering org.apache.hadoop.yarn.webapp.GenericExceptionHandler as a provider class
    Sep 12, 2023 7:58:26 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
    INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices as a root resource class
    Sep 12, 2023 7:58:26 PM com.sun.jersey.server.impl.application.WebApplicationImpl _initiate
    INFO: Initiating Jersey application, version 'Jersey: 1.19 02/11/2015 03:25 AM'
    Sep 12, 2023 7:58:26 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
    INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver to GuiceManagedComponentProvider with the scope "Singleton"
    Sep 12, 2023 7:58:26 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
    INFO: Binding org.apache.hadoop.yarn.webapp.GenericExceptionHandler to GuiceManagedComponentProvider with the scope "Singleton"
    Sep 12, 2023 7:58:26 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
    INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices to GuiceManagedComponentProvider with the scope "PerRequest"
    log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
    log4j:WARN Please initialize the log4j system properly.
    log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
The last one was empty. All the others have this:
    /usr/bin/env: 'python': No such file or directory
Now, Python is installed. This is, however, the first time I'm running a Hadoop job on my VM. I followed the installation steps provided by my professor, so I don't really know what's gone wrong.
It seems you may have `#!/usr/bin/env python` at the top of your file. The error means that a `python` binary is not on the OS PATH (even if Python itself is installed), so the `env` command cannot resolve it.
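A quick way to confirm this on the VM (a sketch assuming a Debian/Ubuntu-style system, where modern releases ship only a `python3` binary):

```shell
# Is there a `python` executable on PATH? (this is exactly what `env` looks up)
command -v python || echo "no 'python' on PATH"

# Many distros now ship only python3
command -v python3

# If only python3 exists, either change the shebang in mapper.py / reducer.py to:
#   #!/usr/bin/env python3
# or, on Debian/Ubuntu, install the compatibility package that symlinks python -> python3:
#   sudo apt install python-is-python3
```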
That shebang line isn't strictly necessary, as long as you pass the Python executable explicitly to the MapReduce streaming command.
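For example, with Hadoop Streaming you can name the interpreter in the `-mapper`/`-reducer` options instead of relying on the shebang (the jar path, file names, and HDFS paths below are placeholders for your own job):

```shell
hadoop jar "$HADOOP_HOME"/share/hadoop/tools/lib/hadoop-streaming-*.jar \
    -files mapper.py,reducer.py \
    -mapper "python3 mapper.py" \
    -reducer "python3 reducer.py" \
    -input /user/hadoop/input \
    -output /user/hadoop/output
```

Because the command line says `python3 mapper.py` explicitly, the task containers never consult the script's shebang at all.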
However, hardly anyone writes raw MapReduce like this anymore since the arrival of Spark/PySpark and libraries such as mrjob, so unfortunately I think you're being taught outdated material.
log4j:WARN Please initialize the log4j system properly
This is a separate issue, but you should definitely try to fix it for debugging other YARN problems.
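The warning just means log4j 1.x found no configuration on the classpath. A minimal `log4j.properties` that gives it a console appender would look like this (assuming your Hadoop build still uses log4j 1.2 and reads configuration from `$HADOOP_CONF_DIR`; normally Hadoop ships one there already, so check that it hasn't been removed or shadowed):

```
# Minimal log4j 1.2 configuration: log INFO and above to the console
log4j.rootLogger=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
```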