Tags: node.js, amazon-web-services, amazon-s3, fs

Is it possible to upload a file to S3 without reading or duplicating it?


I'm trying to upload a file to S3, but the file is large and we need to upload it very frequently. So I was looking for an option to upload a file to S3 using Node.js without reading the whole file content. The code below works fine, but it reads the entire file into memory every time I want to upload:

const aws = require("aws-sdk");
const fs = require("fs");

aws.config.update({
  secretAccessKey: process.env.ACCESS_SECRET,
  accessKeyId: process.env.ACCESS_KEY,
  region: process.env.REGION,
});
const BUCKET = process.env.BUCKET;
const s3 = new aws.S3();
const fileName = "logs.txt";

const uploadFile = () => {
  // Reads the entire file into memory before uploading
  fs.readFile(fileName, (err, data) => {
    if (err) throw err;
    const params = {
      Bucket: BUCKET, // your bucket name
      Key: fileName, // the key the object will be saved under
      Body: data, // the whole file contents as a Buffer
    };
    s3.upload(params, function (s3Err, data) {
      if (s3Err) throw s3Err;
      console.log(`File uploaded successfully at ${data.Location}`);
    });
  });
};

uploadFile();

Solution

  • You can use multipart upload:

    AWS article: https://aws.amazon.com/blogs/aws/amazon-s3-multipart-upload/

    A related SO question about the same for Python: Can I stream a file upload to S3 without a content-length header?

    JS API reference manual: https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3/ManagedUpload.html

    The basic example is:

    var upload = new AWS.S3.ManagedUpload({
      params: {Bucket: 'bucket', Key: 'key', Body: stream}
    });
    

    So you have to provide a stream as the input:

    const readableStream = fs.createReadStream(filePath);
    

    The Node.js fs API is documented here: https://nodejs.org/api/fs.html#fscreatereadstreampath-options
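
    Putting these pieces together, a minimal sketch of a streaming upload could look like the following (assuming AWS SDK v2 with credentials configured as in the question; uploadFileStreaming is just a name for this sketch, and the partSize and queueSize values are illustrative). Note that constructing a ManagedUpload does not start the transfer; you have to call send() or promise():

    const AWS = require("aws-sdk");
    const fs = require("fs");

    const uploadFileStreaming = async (filePath, bucket, key) => {
      // The SDK consumes the stream in chunks and uploads them as
      // multipart parts, so the whole file is never held in memory.
      const upload = new AWS.S3.ManagedUpload({
        partSize: 10 * 1024 * 1024, // 10 MB per part (illustrative)
        queueSize: 4, // up to 4 parts in flight concurrently
        params: {
          Bucket: bucket,
          Key: key,
          Body: fs.createReadStream(filePath),
        },
      });

      // promise() starts the upload and resolves once all parts are done
      const result = await upload.promise();
      console.log(`File uploaded successfully at ${result.Location}`);
    };

    uploadFileStreaming("logs.txt", process.env.BUCKET, "logs.txt")
      .catch(console.error);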

    Of course, you can also process the data while reading it and then pass it to the S3 API; you just have to implement the Stream API.
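
    For instance, here is a minimal sketch that gzips the file on the fly using Node's built-in zlib Transform stream and uploads the compressed output; any custom Transform stream could be swapped in the same way (the logs.txt.gz key is just an example):

    const AWS = require("aws-sdk");
    const fs = require("fs");
    const zlib = require("zlib");

    // Process the data while reading it: pipe the file through a
    // Transform stream (gzip here) and hand the result to S3.
    const body = fs.createReadStream("logs.txt").pipe(zlib.createGzip());

    new AWS.S3.ManagedUpload({
      params: {
        Bucket: process.env.BUCKET,
        Key: "logs.txt.gz", // example key for the compressed object
        Body: body,
      },
    })
      .promise()
      .then((result) => console.log(`Uploaded to ${result.Location}`))
      .catch(console.error);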