Tags: hadoop, amazon-s3, jets3t

jets3t cannot upload file to s3


I'm trying to upload files from the local filesystem to S3 using the Hadoop FileSystem API backed by JetS3t, but I'm getting the following error:

Caused by: java.util.concurrent.ExecutionException: org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException: Request Error. HEAD '/project%2Ftest%2Fsome_event%2Fdt%3D2015-06-17%2FsomeFile' on Host 'host.s3.amazonaws.com' @ 'Thu, 18 Jun 2015 23:33:01 GMT' -- ResponseCode: 404, ResponseStatus: Not Found, RequestId: AVDFJKLDFJ3242, HostId: D+sdfjlakdsadf\asdfkpagjafdjsafdj

I'm confused as to why JetS3t needs to do a HEAD request for an upload. Since the files I'm uploading don't exist on S3 yet, of course they aren't found.

Since I'm getting a 404 (Not Found) rather than a 403 (Forbidden), I assume it can't be a permissions issue.

The code that triggers this error is:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.net.URI;

...

String path = "s3n://mybucket/path/to/event/partition/file";
Configuration conf = new Configuration();
conf.set("fs.s3n.awsAccessKeyId", "MYACCESSKEY");
conf.set("fs.s3n.awsSecretAccessKey", "MYSECRETKEY");
FileSystem fileSystem = FileSystem.get(URI.create(path), conf);
// moveFromLocalFile takes Path arguments, not Strings
fileSystem.moveFromLocalFile(new Path("my/source/path/to/file"), new Path(path));
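
From digging around, the HEAD seems to come from an existence check that Hadoop runs on the destination before the copy. Roughly like the sketch below (a paraphrase of my understanding, not the actual Hadoop source; fileSystem and path are the variables from my snippet above):

import java.io.FileNotFoundException;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;

// Before copying, Hadoop checks whether the destination already
// exists. On s3n, this getFileStatus() call is what issues the
// HEAD request against the key.
try {
    FileStatus status = fileSystem.getFileStatus(new Path(path));
    // destination already exists -> the copy would be rejected
} catch (FileNotFoundException e) {
    // S3 answered the HEAD with 404: the key doesn't exist yet.
    // For a fresh upload this is the expected path, so the 404
    // should be swallowed here rather than bubbling up as an
    // S3ServiceException like in my stack trace.
}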

Solution

  • Alright, I'm answering this for posterity. The issue was actually Maven.

    It turned out I was pulling in incompatible versions of the two frameworks: the JetS3t jar on my classpath wasn't the one my Hadoop version was built against. Maven, of course, being Maven, can't detect that kind of mismatch on its own. As far as I can tell, that also explains the symptom: the 404 from the routine existence check is normally caught and treated as "the key doesn't exist yet", but with mismatched jars it escaped as a hard failure instead.
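
    The fix was to pin JetS3t to the version my Hadoop release was built against. As a sketch (versions here are illustrative, not a recommendation — check what your exact Hadoop release declares in its own pom; Hadoop 2.4.x, for example, was built against JetS3t 0.9.0):

    <!-- Illustrative only: match the jets3t version to the one
         declared by your Hadoop release's own pom. -->
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>2.4.0</version>
    </dependency>
    <dependency>
        <groupId>net.java.dev.jets3t</groupId>
        <artifactId>jets3t</artifactId>
        <version>0.9.0</version>
    </dependency>

    Running mvn dependency:tree -Dincludes=net.java.dev.jets3t shows which JetS3t version actually ends up on the classpath and which dependency dragged it in.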