I have been trying to load a large language model (> 5 GB) hosted on S3 for use in a Lambda function, but have so far been unsuccessful. The function just continuously times out after a few minutes, even when configured with 10,240 MB of memory.
I assume this is due to Lambda's limits, combined with having to stream such a large file from S3 on every invocation.
For my implementation, my function needs to be able to load the language model fairly quickly (~5-10 seconds).
Being quite new to AWS, is there a better way of doing this?
Store the model on an EFS file system and attach it to the Lambda function. The model is then read from the mounted path instead of being streamed from S3 on each invocation, so it loads much faster.
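A minimal sketch of what the Lambda side could look like, assuming the EFS access point is mounted at `/mnt/ml` (a hypothetical mount path you configure in the function's file-system settings) and the model file is named `model.bin`. Caching the loaded model in a module-level variable means only cold starts pay the load cost; warm invocations reuse it:

```python
import os

# Hypothetical mount path set in the Lambda file-system configuration.
MODEL_DIR = os.environ.get("MODEL_DIR", "/mnt/ml")

_model = None  # cached across warm invocations of the same container


def load_model(path):
    """Load the model once and cache it; later calls return the cached object."""
    global _model
    if _model is None:
        # Placeholder: read raw bytes. Swap in your framework's loader
        # (e.g. torch.load or from_pretrained) pointed at the EFS path.
        with open(path, "rb") as f:
            _model = f.read()
    return _model


def handler(event, context):
    model = load_model(os.path.join(MODEL_DIR, "model.bin"))
    # ... run inference with the loaded model ...
    return {"model_bytes": len(model)}
```

Note that the Lambda must be placed in the same VPC as the EFS mount targets, and the function's execution role needs EFS client permissions; the first cold start still reads the full file from EFS, so provisioned concurrency can help if you need consistent ~5-10 second responses.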