python, serverless-framework

Serverless Framework, Python Service Too Large


I have a very simple Serverless Framework service. In the service, I am using the serverless-python-requirements plugin for deployment. I am successfully able to deploy from my local machine, but the deployment fails in my CI/CD pipeline:

UPDATE_FAILED: AzurePhishSimReportFunctionLambdaFunction (AWS::Lambda::Function) Resource handler returned message: "Unzipped size must be smaller than 262144000 bytes (Service: Lambda, Status Code: 400, Request ID: 2b53c48c-adb8-45b8-8844-8bfb317612bb)" (RequestToken: 126200f8-fd75-9bd8-4c0e-6ebf4d0c4e40, HandlerErrorCode: InvalidRequest)

When deploying from my local machine, the service is zipped into a small package:

Uploading service azure-phishing-simulation-reporter.zip file to S3 (16.58 MB)

However, when deploying from CI/CD, the package is significantly larger:

Uploading service azure-phishing-simulation-reporter.zip file to S3 (136.71 MB)

My understanding is that this may be due to local caching; however, the solution proposed in this thread had no effect: https://forum.serverless.com/t/different-package-sizes-when-deploying/7819/2. Regardless, I am not overly concerned about the size discrepancy itself, as I plan to deploy via CI/CD.
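
If the difference really were down to the plugin's caching, one way to rule it out would be to disable both caches and compare package sizes. A minimal sketch, assuming the plugin's documented useDownloadCache/useStaticCache options:

custom:
  pythonRequirements:
    useDownloadCache: false   # skip the shared pip download cache
    useStaticCache: false     # skip reusing previously built .serverless/requirements output

Disabling both forces a full re-download and rebuild of the requirements on every deploy, which makes local and CI packages more directly comparable.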

The unfortunate reality is that the service only has three top-level requirements:

azure-phishing-simulation-reporter ➜  /bin/cat Pipfile
...

[packages]
requests = "*"
boto3 = "*"
msal = "*"

...

Looking at the requirements directory, it is clear that botocore (a dependency of boto3) and cryptography (a dependency of all three) are the main culprits:

azure-phishing-simulation-reporter ➜  du -s .serverless/requirements/* | sort -nr
154592  .serverless/requirements/botocore
 31528  .serverless/requirements/cryptography
  2088  .serverless/requirements/boto3
  1264  .serverless/requirements/pycparser
   976  .serverless/requirements/dateutil
   936  .serverless/requirements/charset_normalizer
   920  .serverless/requirements/urllib3
   800  .serverless/requirements/cffi
   632  .serverless/requirements/msal
   600  .serverless/requirements/s3transfer
   576  .serverless/requirements/certifi
   568  .serverless/requirements/idna
   440  .serverless/requirements/requests
   424  .serverless/requirements/_cffi_backend.cpython-39-darwin.so
   160  .serverless/requirements/jwt
   160  .serverless/requirements/jmespath
    72  .serverless/requirements/six.py
    16  .serverless/requirements/bin
     8  .serverless/requirements/requirements.txt

While possible, I have read that it is considered bad practice to rely on the AWS Lambda runtime to supply the boto3 library (or the version of requests vendored into botocore): https://github.com/serverless/serverless-python-requirements/issues/304#issuecomment-455359902

Hi all. I’m a SA for AWS.

Our leading practice is to ship your own version of boto with your app code either as a part of the handler zip or as a layer (for larger projects).

Now to what I have attempted, which is virtually every option native to the serverless-python-requirements plugin (https://www.serverless.com/plugins/serverless-python-requirements); a sketch of these options follows the list:

  • zip
  • slim
  • layers
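
Roughly, these look like the following in serverless.yml, per the plugin's documented settings (the function name is a placeholder, and the options are shown together here only for illustration):

custom:
  pythonRequirements:
    zip: true     # package requirements as an in-function zip, extracted at runtime
    slim: true    # strip *.pyc, __pycache__, dist-info, etc. from the requirements
    layer: true   # publish the requirements as a Lambda layer instead of bundling them

functions:
  reporter:                                   # placeholder function name
    handler: handler.main
    layers:
      - Ref: PythonRequirementsLambdaLayer    # layer reference generated by the plugin

Per the plugin docs, zip: true also requires importing unzip_requirements inside a try/except at the top of the handler.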

I've also tried to exclude files/directories via the following block in serverless.yml:

package:
  patterns:
    - '!.git/**'
    - '!test/**'
    - '!e2e/**'
    - '!src/**'
    - '!node_modules/**'
    - '!venv/**'
    - '!__pycache__/**'
    - '!requirements.txt'
    - '!Pipfile'
    - '!Pipfile.lock'
    - '!README.md'

Nothing works. I feel like this should be an easy problem to solve, and I find it frustrating that AWS's own libraries are largely responsible for bloating deployment packages beyond AWS's own size restrictions. I had high hopes for layers, and was shocked to see that the layer was tiny in comparison to the deployment package:

Uploading service azure-phishing-simulation-reporter.zip file to S3 (119.5 MB)

Uploading service pythonRequirements.zip file to S3 (17.77 MB)

Guidance on the proper way to circumvent this is appreciated!


Solution

  • Answer: I am an idiot.

    See if you can spot the issue:

    # .gitlab-ci.yml
    
    .deploy_script: &deploy_script
      - /usr/local/bin/python -m pip install --upgrade pip 
      - pip install pipenv
      - npm install -g serverless@3
      - npm install
      - export NODE_OPTIONS=--max_old_space_size=4096
    
    deploy_to_dev:
      stage: deploy
      script:
        - curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
        - unzip awscliv2.zip
        - ./aws/install
        - export AWS_ACCESS_KEY_ID=$MGMT_AWS_ACCESS_KEY_ID
        - export AWS_SECRET_ACCESS_KEY=$MGMT_AWS_SECRET_ACCESS_KEY
        - export AWS_DEFAULT_REGION=us-east-1
        - aws configure list
        - aws sts get-caller-identity | tee
        - *deploy_script
        - serverless deploy --stage dev --verbose
      only:
        - main
    

    I was previously having issues with a protected branch that wasn't allowing me to retrieve the GitLab CI/CD environment variables for AWS. While running down that issue, I added a few extra steps to the pipeline to download, extract, and install the AWS CLI, so that I could validate that I was actually getting and assuming an identity.

    Of course, the CI/CD pipeline drops you into a freshly checked-out copy of your repo, and I was downloading and extracting IN that directory. So those artifacts were being packaged right back up with my serverless deploy, bloating the package significantly.

    After commenting out the AWS CLI-related steps:

    Uploading service azurePhishSimReportFunction.zip file to S3 (14.82 MB)
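
    For what it's worth, the steps probably didn't need to be removed entirely; downloading and extracting the CLI outside the checkout should keep it out of the package. A rough sketch of the relevant script lines (the /tmp paths are an assumption, not what I actually ran):

        # download and extract outside the repo checkout so nothing gets packaged
        - curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "/tmp/awscliv2.zip"
        - unzip -q /tmp/awscliv2.zip -d /tmp
        - /tmp/aws/install

    Excluding aws/** and awscliv2.zip via the package patterns block would be another way to the same end.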

    It's always fun to find out the problem was you.

    Thanks to the folks who chimed in - I appreciate you taking a little bit of time out.