Tags: python, amazon-s3, emr, amazon-emr, mrjob

MRJob fails with Hadoop error copyToLocal: [...] No such file or directory



I'm running a simple Hadoop job using MRJob on an EMR cluster. The job starts normally, but then fails during bootstrapping:

Job launched 181.2s ago, status STARTING: Provisioning Amazon EC2 capacity
Job launched 211.4s ago, status STARTING: Provisioning Amazon EC2 capacity
Job launched 241.6s ago, status BOOTSTRAPPING: Running bootstrap actions
Job launched 271.8s ago, status BOOTSTRAPPING: Running bootstrap actions
Job on job flow j-7711LTEPTIOB failed with status SHUTTING_DOWN: On the master instance (i-bed4e153), bootstrap action 1 returned a non-zero return code

The EMR log contains the following error:

copyToLocal: `s3://[path-to-file]/mrjob.tar.gz': No such file or directory

However, I can see that the file was in fact uploaded: it is on S3 at exactly that location.
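
The job itself is simple. For illustration only, a minimal MRJob of the kind involved looks roughly like the sketch below; this is a hypothetical stand-in, not my actual code, and mr_word_count.py and the bucket paths are placeholders.

# mr_word_count.py -- hypothetical stand-in for the actual job
from mrjob.job import MRJob

class MRWordCount(MRJob):

    def mapper(self, _, line):
        # emit a count of 1 for every word on the input line
        for word in line.split():
            yield word, 1

    def reducer(self, word, counts):
        # sum the counts emitted for each word
        yield word, sum(counts)

if __name__ == '__main__':
    MRWordCount.run()

It is launched against EMR with something like: python mr_word_count.py -r emr s3://my-bucket/input/ --output-dir s3://my-bucket/output/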

Please help!


Solution

  • Got it. The error was actually in the [path-to-file]: my $USER environment variable contained a backslash ('\'), and MRJob used that value as part of the name of its temporary folder on S3. A backslash is not accepted in an S3 directory name, so the copyToLocal step could not find the file. The solution was to override $USER with a clean value inside a virtual environment; see the sketch below.
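
To make the workaround concrete, here is a minimal sketch of sanitizing $USER from Python before the job is launched. This only illustrates the idea described above and is not MRJob's own code; the regex, the fallback name "hadoop", and the choice of '-' as the replacement character are my own assumptions.

# Illustrative sketch: strip characters from $USER that would break an S3 key.
import os
import re

raw_user = os.environ.get("USER", "hadoop")              # e.g. "DOMAIN\\alice"
safe_user = re.sub(r"[^A-Za-z0-9._-]", "-", raw_user)    # -> "DOMAIN-alice"
os.environ["USER"] = safe_user                           # set before launching the job

The same thing can be done in the shell instead, e.g. by exporting a clean USER from the virtual environment's activate script before running the job, which is what the fix above amounts to.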