Our application's user base has reached 2M users, and we are planning to scale up the application using AWS.
The main problem we are facing is handling shared data, which includes cache, uploads, models, sessions, etc.
One option is AWS EFS, but we expect it will kill the application's performance, as the files are really small (ranging from a few bytes to a few MB) and are updated very frequently.
We can use Memcached/Redis for sessions and S3 for uploads, but we still need to manage the cache, models, and some other shared files.
Is there any alternative to EFS or any way to make EFS work for this scenario where small files are updated frequently?
Small files and frequent updates should not be a problem for EFS.
The problem some users encountered in the original release was that it had two dimensions tightly coupled together -- the amount of throughput available was a function of how much you were paying, and how much you were paying was a function of the total size of the filesystem (all files combined, regardless of individual file sizes)... so the larger, the faster.
But they have, since then, introduced "provisioned throughput," allowing you to decouple these two dimensions.
From the announcement:

"This default Amazon EFS throughput bursting mode offers a simple experience that is suitable for a majority of applications. Now with Provisioned Throughput, applications with throughput requirements greater than those allowed by Amazon EFS’s default throughput bursting mode can achieve the throughput levels required immediately and consistently independent of the amount of data."
https://aws.amazon.com/about-aws/whats-new/2018/07/amazon-efs-now-supports-provisioned-throughput/
If you use this feature, you pay for the difference between the throughput you provision and the throughput that would have been included anyway, based on the size of the data.
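If it helps to see what toggling the feature looks like in practice, here is a minimal boto3 sketch of switching an existing filesystem to provisioned throughput. The filesystem ID, region, and the 128 MiB/s figure are placeholders, not values from your setup; size the provisioned rate to your own workload.

```python
import boto3

efs = boto3.client("efs", region_name="us-east-1")  # region is a placeholder

# Switch an existing filesystem from bursting to provisioned throughput.
# "fs-12345678" and 128 MiB/s are illustrative values only -- you pay for
# the difference between this rate and the baseline, as described above.
efs.update_file_system(
    FileSystemId="fs-12345678",
    ThroughputMode="provisioned",
    ProvisionedThroughputInMibps=128.0,
)

# Dropping back to bursting later is just another update:
# efs.update_file_system(FileSystemId="fs-12345678", ThroughputMode="bursting")
```

Note that AWS limits how often you can switch throughput modes or decrease the provisioned rate (on the order of once per 24 hours), so check the current limits before scripting frequent changes.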
See also Amazon EFS Performance in the Amazon Elastic File System User Guide.
Provisioned throughput can be activated and deactivated, so don't confuse it with the two performance modes, General Purpose and Max I/O. One of those must be selected when the filesystem is created, and that selection can't be changed later. The performance modes relate to an optional tradeoff in the underlying infrastructure (Max I/O trades somewhat higher per-operation latencies for higher aggregate throughput and IOPS), and the recommended practice is to select General Purpose unless observed metrics give you a reason not to. Max I/O also does not have the same metadata consistency model as General Purpose.
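To make the creation-time vs. runtime distinction concrete, here is a small boto3 sketch (the creation token and tag name are made up): PerformanceMode is the one-time choice, while ThroughputMode can still be changed afterwards, as shown earlier.

```python
import boto3

efs = boto3.client("efs", region_name="us-east-1")  # region is a placeholder

# PerformanceMode is fixed for the life of the filesystem;
# ThroughputMode can be changed later with update_file_system.
fs = efs.create_file_system(
    CreationToken="shared-app-data",       # any unique string (idempotency token)
    PerformanceMode="generalPurpose",      # or "maxIO" -- cannot be changed later
    ThroughputMode="bursting",             # switchable to "provisioned" afterwards
    Tags=[{"Key": "Name", "Value": "shared-app-data"}],
)
print(fs["FileSystemId"], fs["PerformanceMode"])
```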