Heroku doesn't update the GitHub filesystem when an image is uploaded from the website

I ran into a problem where Heroku doesn't update my GitHub repository (or, say, the static filesystem) when a blog post (including pictures) is created from the website.

Other images survive, whilst the ones saved to my filesystem while the server is running on Heroku disappear.

I found this in their documentation:

The Heroku filesystem is ephemeral - that means that any changes to the filesystem whilst the dyno is running only last until that dyno is shut down or restarted.

I'm still confused about why not all the pictures disappear, only those added later.

Is AWS S3 a solution for this? If it is, how can I represent my filesystem using buckets?

Say that for Blog Post 1 I have 2 picture resolutions, which means storing the files in different folders corresponding to those resolutions.

---1920x1920
-----picture.jpg
---800x800
-----picture.jpg

Does that mean I have to create 2 buckets named 1920x1920 and 800x800 or is there a better way of handling them?

Solution

Is AWS S3 a solution for this?

S3 is the recommended solution for this, and the configuration is documented in the Heroku Dev Center, with specific instructions for uploading from Python.

Note that these Python instructions use the Direct Upload approach: the Flask app generates a pre-signed URL, which is passed back to the client-side JavaScript so that the user's browser can upload to S3 directly. The resulting S3 URL of the image is then put into a hidden element in the form, which your app receives on form submit.
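
As a rough illustration of the server side of Direct Upload, here is a minimal sketch that assumes boto3 is the S3 client library. The helper name `presign_image_upload` and the bucket and key values are hypothetical; `generate_presigned_post` is the boto3 client method that does the signing.

```python
def presign_image_upload(s3_client, bucket, key, content_type, expires_in=3600):
    """Return the URL and form fields the browser needs to POST a file to S3.

    s3_client is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    return s3_client.generate_presigned_post(
        Bucket=bucket,
        Key=key,
        # The browser must send this exact Content-Type with the upload.
        Fields={"Content-Type": content_type},
        Conditions=[{"Content-Type": content_type}],
        ExpiresIn=expires_in,  # seconds until the signature expires
    )
```

A Flask view would typically return this dictionary as JSON. The client-side script then POSTs to the returned `url`, sending every entry of `fields` as form data along with the file itself.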

The fact that you have separate image sizes suggests your app does some processing (maybe with PIL) to produce these thumbnails. In that case it may be easier to use the Pass-Through approach, where your app implements its own upload mechanism, does the processing and then uploads the thumbnails to S3 (the upload-to-S3 part is well documented, such as in this SO thread).
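
A minimal sketch of the Pass-Through idea, assuming Pillow does the resizing and a boto3 S3 client does the upload. The names `SIZES`, `resize_jpeg` and `upload_thumbnails` are hypothetical, and the sizes are simply taken from the folder names in the question.

```python
import io

# Hypothetical target sizes, taken from the folder names in the question.
SIZES = {"1920x1920": (1920, 1920), "800x800": (800, 800)}


def resize_jpeg(original_bytes, max_size):
    """Return JPEG bytes scaled to fit within max_size, keeping aspect ratio."""
    from PIL import Image  # Pillow; imported lazily so only resizing needs it

    image = Image.open(io.BytesIO(original_bytes))
    image = image.convert("RGB")  # JPEG cannot store alpha or palette modes
    image.thumbnail(max_size)     # resizes in place and never enlarges
    buffer = io.BytesIO()
    image.save(buffer, format="JPEG")
    return buffer.getvalue()


def upload_thumbnails(s3_client, bucket, filename, original_bytes, resize=resize_jpeg):
    """Resize the upload once per size and store each result in S3."""
    keys = []
    for folder, max_size in SIZES.items():
        key = f"{folder}/{filename}"
        s3_client.put_object(
            Bucket=bucket,
            Key=key,
            Body=resize(original_bytes, max_size),
            ContentType="image/jpeg",
        )
        keys.append(key)
    return keys
```

In a Flask view, `original_bytes` could come from `request.files["image"].read()`, where the form field name `image` is also an assumption. The returned keys are what the app would store alongside the blog post instead of local file paths.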

The Pass-Through method carries the warning that it may block a single-threaded worker. If your site gets a volume of requests that makes this an issue, you may need to increase the number of gunicorn workers, or change to a worker type that supports concurrency (this GitHub post has some useful commands/info on concurrent worker types).
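
Both options can be expressed in a gunicorn configuration file, which is itself Python. The sketch below is illustrative only: the numbers are assumptions to be tuned against the dyno's memory, and `gthread` is just one of the concurrent worker types gunicorn ships with.

```python
# gunicorn.conf.py

# More worker processes, so one slow upload does not stall every request.
workers = 3

# Threaded workers let each process serve several requests concurrently.
worker_class = "gthread"
threads = 4
```

The file can be passed explicitly with `gunicorn -c gunicorn.conf.py app:app`, where `app:app` stands in for your own module and application object.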

The best way to implement this whole thing may be with Background Tasks using rq, although the requirement for a Redis add-on and a worker dyno may push you into the paid tier. You use the Direct Upload approach above to upload the original image, then have a background job download it, do the resizing, and put the resulting thumbnails back onto S3.
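
The background-job flow could look roughly like the sketch below. It assumes rq, the redis client and boto3 are installed, and that the Redis connection string is available in a `REDIS_URL` environment variable. The function names, the `SIZES` mapping and the `resize` callable (any module-level function taking image bytes and a `(width, height)` tuple and returning bytes) are all hypothetical.

```python
import os

# Hypothetical target sizes, taken from the folder names in the question.
SIZES = {"1920x1920": (1920, 1920), "800x800": (800, 800)}


def make_thumbnails(bucket, original_key, resize, s3_client=None):
    """Job run by the worker: download the original, resize, upload results."""
    if s3_client is None:
        import boto3  # lazy import: the worker process builds its own client
        s3_client = boto3.client("s3")

    # Fetch the image that the browser uploaded directly to S3.
    response = s3_client.get_object(Bucket=bucket, Key=original_key)
    original = response["Body"].read()

    filename = original_key.rsplit("/", 1)[-1]
    keys = []
    for folder, max_size in SIZES.items():
        key = f"{folder}/{filename}"
        s3_client.put_object(
            Bucket=bucket,
            Key=key,
            Body=resize(original, max_size),
            ContentType="image/jpeg",
        )
        keys.append(key)
    return keys


def enqueue_thumbnails(bucket, original_key, resize):
    """Called by the web app on form submit; returns immediately."""
    from redis import Redis
    from rq import Queue

    queue = Queue(connection=Redis.from_url(os.environ["REDIS_URL"]))
    return queue.enqueue(make_thumbnails, bucket, original_key, resize)
```

Because rq pickles the job and its arguments, both `make_thumbnails` and whatever is passed as `resize` need to live in a module the worker dyno can import. The web request only pays for the enqueue call, so a slow resize no longer blocks a gunicorn worker.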

Does that mean I have to create 2 buckets named 1920x1920 and 800x800 or is there a better way of handling them?

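Have one bucket for the entire app, and just include forward slashes in the object's key to mimic a subdirectory structure. For example, the keys 1920x1920/picture.jpg and 800x800/picture.jpg can live side by side in the same bucket.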
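
To make the key layout concrete, here is a small sketch. The helper `object_key` and the extra per-post prefix `blog-post-1` are assumptions added for illustration; the resolutions and filename come from the question.

```python
def object_key(post, resolution, filename):
    """Build an S3 key that reads like a path; S3 stores it as one flat string."""
    return f"{post}/{resolution}/{filename}"


keys = [
    object_key("blog-post-1", resolution, "picture.jpg")
    for resolution in ("1920x1920", "800x800")
]

# Everything in one "folder" simply shares a key prefix.
large = [key for key in keys if key.startswith("blog-post-1/1920x1920/")]
```

With boto3, the same prefix filter can be applied on the S3 side by passing `Prefix="blog-post-1/1920x1920/"` to `list_objects_v2`.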