I've been exploring the topic of upload sessions, and I still have a lot of questions.
I've looked at how some websites, such as Mangadex, implement upload sessions. The flow appears to be:
- Send a request to the server to retrieve the old upload session ID. If one exists, delete the old session.
- Send another request to get a new upload session ID.
- Batch upload user files to S3 (3 files per request). The response contains information such as a generated file ID (a UUID v4 in this case), the file hash, the original file name, etc.
- Send a commit request to the server to confirm the upload session is complete. The server processes the session (resize, scramble, etc.).
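The four steps above can be sketched as a small session store. This is only a rough illustration of the flow as I understand it, not how Mangadex actually implements it: `UploadSession`, `SessionStore`, and the in-memory dictionary (standing in for a database) are all made-up names.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class UploadSession:
    """One user's in-progress upload (hypothetical shape)."""
    session_id: str
    file_ids: list[str] = field(default_factory=list)
    committed: bool = False


class SessionStore:
    """In-memory stand-in for a database table keyed by user ID."""

    def __init__(self) -> None:
        self._by_user: dict[str, UploadSession] = {}

    def begin(self, user_id: str) -> UploadSession:
        # Steps 1 and 2: drop any stale session, then open a fresh one.
        self._by_user.pop(user_id, None)
        session = UploadSession(session_id=str(uuid.uuid4()))
        self._by_user[user_id] = session
        return session

    def add_file(self, user_id: str) -> str:
        # Step 3: register one uploaded file and return its generated ID.
        session = self._by_user[user_id]
        if session.committed:
            raise ValueError("session already committed")
        file_id = str(uuid.uuid4())
        session.file_ids.append(file_id)
        return file_id

    def commit(self, user_id: str) -> list[str]:
        # Step 4: mark the session complete; post-processing would start here.
        session = self._by_user[user_id]
        session.committed = True
        return list(session.file_ids)
```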
My questions:
Can I handle the file upload process from the client to S3 without an intermediary server that generates a file ID, or must I pass the file through an intermediary to generate the file ID?
If the files must go through the server to generate file IDs, how can I scale it to handle up to 1000 images per minute? My VPS has 4GB of RAM.
The process above is only my assumption. If it is wrong, please provide me with the correct way to implement a file upload session.
You could use a presigned URL for that:
- The client sends a request to the server to generate a presigned URL. The server authenticates the request and decides where the file will be placed, etc.
- The server then sends the presigned URL, which has an expiry time, back to the client. The client uses that URL to upload the file directly to S3, so the file bytes never pass through your server.
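As a rough sketch of the server side of these two steps, assuming a boto3 S3 client is passed in (for example `boto3.client("s3")`): the function name `issue_upload_url`, the `uploads/<session>/<file>` key layout, and the 300-second expiry are illustrative assumptions. Note that the server can generate the file ID here without ever receiving the file itself.

```python
import uuid
from typing import Any


def issue_upload_url(s3_client: Any, bucket: str, session_id: str,
                     content_type: str, expires_in: int = 300) -> dict[str, str]:
    """Create a file ID and a short-lived presigned PUT URL for it."""
    # The ID is generated server-side, but no file bytes are involved.
    file_id = str(uuid.uuid4())
    key = f"uploads/{session_id}/{file_id}"  # hypothetical key layout
    # boto3's generate_presigned_url signs the request locally;
    # it does not call S3 over the network.
    url = s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": key, "ContentType": content_type},
        ExpiresIn=expires_in,  # seconds until the URL stops working
    )
    return {"file_id": file_id, "key": key, "url": url}
```

The client would then send an HTTP PUT with the file body to `url`, using the same `Content-Type` header that was signed.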
You may ask about the file information (size, metadata, ...). You can collect that information on the client and send it in the first request, that is, the same request that returns the presigned URL. At that point the server can verify things such as:
- Is the client's domain one that belongs to your system?
- Is the file type in the accepted list?
- ... etc
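A minimal sketch of such checks, run before issuing the URL. The allowed origins, accepted types, and the 10 MiB size cap are placeholder values I made up for illustration; use whatever your system actually requires.

```python
ALLOWED_ORIGINS = {"https://app.example.com"}               # hypothetical
ALLOWED_TYPES = {"image/png", "image/jpeg", "image/webp"}   # hypothetical
MAX_SIZE_BYTES = 10 * 1024 * 1024                           # hypothetical 10 MiB cap


def validate_upload_request(origin: str, content_type: str, size: int) -> list[str]:
    """Return a list of problems; an empty list means the request is acceptable."""
    problems = []
    if origin not in ALLOWED_ORIGINS:
        problems.append("origin not allowed")
    if content_type not in ALLOWED_TYPES:
        problems.append("file type not accepted")
    if size <= 0 or size > MAX_SIZE_BYTES:
        problems.append("file size out of range")
    return problems
```

The server would only generate the presigned URL when the returned list is empty.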
Note:
- You may need to verify some properties on both the client side and the server side.
- If you upload several times with the same presigned URL, each upload replaces the previous object.
- You may need another service to check the file content (in case an end user uploads a sensitive file or something like that).
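For the server-side verification, one possible sketch is to compare what the client declared with what S3 actually stored, for example when the session is committed. It assumes a boto3 S3 client and uses its `head_object` call; the function name `verify_uploaded_object` is made up. This only checks size and the stored content type, not the file content itself.

```python
from typing import Any


def verify_uploaded_object(s3_client: Any, bucket: str, key: str,
                           declared_size: int, declared_type: str) -> bool:
    """Compare the client's declared metadata with the stored object's metadata."""
    # head_object returns object metadata without downloading the body.
    head = s3_client.head_object(Bucket=bucket, Key=key)
    return (head["ContentLength"] == declared_size
            and head.get("ContentType") == declared_type)
```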
I hope this helps.