Tags: node.js, amazon-s3, amazon-cloudfront, image-scaling

Best way to save images on Amazon S3 and distribute them using CloudFront


The application I'm working on (nodejs) has user profiles and each profile can have multiple images. I'm using S3 as my main storage and CloudFront to distribute them.

The thing is, sometimes users upload large images, and what I want is to serve a scaled-down image when it's downloaded (viewed in an HTML img tag, or on a mobile phone), mainly for performance.

I don't know whether I should scale the image BEFORE uploading it to S3 (maybe using lwip, https://github.com/EyalAr/lwip), or whether there is a way of scaling the image, or getting a lower-quality version, when downloading it through CloudFront. I've read that CloudFront can compress files using gzip, but also that this isn't recommended for images.

I also don't want to store both a scaled and an original image in S3, because of the storage cost.

Should this be done on the client, on the server, or in S3? What is the best way of doing it?


Solution

  • is there a way of scaling the image, or getting a lower-quality version, when downloading it through CloudFront?

    There is no feature like this. If you want the image resized, resampled, scaled, compressed, etc., you need to do it before it is saved to its final location in S3.

    Note that I say its final location in S3.

    One solution is to upload the image to an intermediate location in S3, perhaps in a different bucket, and then resize it with code that modifies the image and stores it in the final S3 location, whence CloudFront will fetch it on behalf of the downloading user.
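
    A minimal sketch of that intermediate step, assuming the aws-sdk v2 client and the sharp library (one popular alternative to the lwip package mentioned in the question); the bucket names and sizing parameters here are hypothetical:

        const AWS = require('aws-sdk');
        const sharp = require('sharp');

        const s3 = new AWS.S3();

        // Fetch the original from the intermediate bucket, scale it down,
        // and write the result to the final bucket behind CloudFront.
        async function processUpload(key) {
          const original = await s3.getObject({
            Bucket: 'uploads-intermediate',       // hypothetical staging bucket
            Key: key,
          }).promise();

          const resized = await sharp(original.Body)
            .resize({ width: 1024, withoutEnlargement: true })  // cap the width
            .jpeg({ quality: 80 })
            .toBuffer();

          await s3.putObject({
            Bucket: 'images-final',               // hypothetical CloudFront origin
            Key: key,
            Body: resized,
            ContentType: 'image/jpeg',
          }).promise();
        }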

  • I've read that CloudFront can compress files using gzip, but also that this isn't recommended for images.

    Images benefit very little from gzip compression, and in any event the CloudFront documentation indicates that CloudFront doesn't compress anything that isn't in some way formatted as text -- the kind of content that benefits much more from gzip.
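
    You can see this for yourself in a few lines of Node -- assuming a local JPEG named photo.jpg, this is roughly what gzip does to already-compressed image data:

        const fs = require('fs');
        const zlib = require('zlib');

        // JPEG data is already entropy-coded, so gzip has almost nothing
        // left to squeeze out; the result can even be slightly larger.
        const jpeg = fs.readFileSync('photo.jpg');     // hypothetical input file
        const gzipped = zlib.gzipSync(jpeg);
        console.log(`original: ${jpeg.length} bytes, gzipped: ${gzipped.length} bytes`);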

  • I also don't want to store both a scaled and an original image in S3, because of the storage cost.

    I believe this is a mistake on your part.

    "Compressing" images is not like compressing a zip file. Compressing images is lossy. You cannot reconstruct the original image from the compressed version because image compression as discussed here -- by definition -- is the deliberate discarding information from the image to the point that the size is within the desired range and while the quality is in an acceptable range. Image compression is both a science and an art. If you don't retain the original image, and you later decide that you want to modify your image compression algorithm (either because you later decide the sizes are still too large or because you decide the original algorithm was too aggressive and resulted in unacceptably low quality), you can't run your already-compressed images through the compression algorithm a second time without further loss of quality.

    Use S3's STANDARD_IA ("infrequent access") storage class to cut the storage cost of the original images roughly in half, in exchange for more expensive downloads -- a good trade here, because these originals will rarely, if ever, be downloaded again, since only you will know their URLs in the bucket where they are stored.
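
    Setting the storage class is a single parameter on the upload. In this hypothetical sketch, the untouched original goes to an archive bucket as STANDARD_IA:

        const AWS = require('aws-sdk');

        const s3 = new AWS.S3();

        // Archive the untouched original under the infrequent-access
        // storage class; bucket name and key layout are hypothetical.
        async function archiveOriginal(key, body, contentType) {
          await s3.putObject({
            Bucket: 'originals-archive',
            Key: key,
            Body: body,
            ContentType: contentType,
            StorageClass: 'STANDARD_IA',
          }).promise();
        }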

  • Should this be done on the client, on the server, or in S3?

    It can't be done "in" S3 because S3 only stores objects. It doesn't manipulate them.

    That leaves two options, client or server, and doing it on the server itself offers multiple choices.

    When you say "server," you're probably thinking of your web server. That's one option, but image processing can be resource-intensive, so you need to account for it in your plans for scalability.
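
    If you do resize in the web server, the sketch below shows roughly what that looks like with Express, multer (for the multipart upload), and sharp; the route, field name, and parameters are all hypothetical:

        const express = require('express');
        const multer = require('multer');
        const sharp = require('sharp');

        const app = express();
        const upload = multer({ storage: multer.memoryStorage() });

        // Scale the image in the web server process before it ever
        // reaches S3 -- simple, but it ties CPU-heavy image work to
        // the same tier that serves requests.
        app.post('/profile/images', upload.single('image'), async (req, res) => {
          const resized = await sharp(req.file.buffer)
            .resize({ width: 1024, withoutEnlargement: true })
            .jpeg({ quality: 80 })
            .toBuffer();
          // ...upload `resized` to S3 as shown earlier, then respond...
          res.sendStatus(201);
        });

        app.listen(3000);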

    There are projects on GitHub, like this one, designed to do this using AWS Lambda, which provides "serverless" code execution on demand. The code runs on a server, but it's not a server you have to configure or maintain, or pay for when it's not active -- Lambda is billed in 100 millisecond increments. That's the second option.
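
    The Lambda variant is mostly a matter of wiring: subscribe the function to the intermediate bucket's upload events, then run the same kind of resize logic. A hypothetical handler skeleton, reusing a processUpload() function like the one sketched earlier:

        // S3 invokes this handler with a batch of upload event records.
        exports.handler = async (event) => {
          for (const record of event.Records) {
            // S3 event keys arrive URL-encoded, with '+' standing in for spaces.
            const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '));
            await processUpload(key);   // resize + copy to the final bucket
          }
        };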

    Doing it on the client is of course an option, but seems potentially more problematic and error-prone, not to mention that some solutions would be platform-specific.

    There isn't a "best" way to accomplish this task.

    If you aren't familiar with EXIF metadata, you need to familiarize yourself with that as well. In addition to resampling/resizing, you probably also need to strip some of the metadata from user-contributed images, to avoid revealing sensitive data that your users may not realize is attached to their images -- such as the GPS coordinates where a photo was taken. Some sites also watermark their user-submitted images; that would be something you'd probably do at the same time.
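
    Conveniently, if you process images with a library like sharp, the re-encode itself drops EXIF data (sharp omits metadata from its output unless you ask for it with .withMetadata()). A minimal sketch:

        const sharp = require('sharp');

        // Re-encoding strips EXIF (including GPS tags) by default.
        // .rotate() with no arguments applies the EXIF orientation first,
        // so the image isn't left sideways once the orientation tag is gone.
        async function sanitize(inputBuffer) {
          return sharp(inputBuffer)
            .rotate()
            .jpeg({ quality: 80 })
            .toBuffer();
        }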