Tags: amazon-web-services, amazon-s3, aws-lambda, boto3

Is there a way I can copy a Lambda function's code zip file directly to an Amazon S3 bucket, without downloading anything onto my local machine?


I'm using the boto3 client to get a download link for the zip file of a Lambda function on AWS.

I want to transfer this zip file from that download link directly to an S3 bucket, without storing it on or piping it through my machine.

Is it possible to do this with the available AWS APIs?

Edit: Can DataSync perhaps help with this?


Solution

  • You could use the aws s3 cp - command, which streams standard input to a specified bucket and key, and combine it with aws lambda get-function.

    For example, this will stream your function's package directly to S3.

    curl $(aws lambda get-function --function-name <YOUR-FUNCTION-NAME> \
        | jq -r ".Code.Location") \
        | aws s3 cp - s3://<YOUR-BUCKET>/<YOUR-FUNCTION-NAME>.zip
    

    curl in this context does not save the file locally. Instead, it streams the data directly to stdout, which is then piped as stdin to the aws s3 cp - command.
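
    If you'd rather not depend on jq, the CLI's built-in --query and --output text options should work just as well for pulling out the pre-signed URL (same placeholders as above):

    URL=$(aws lambda get-function --function-name <YOUR-FUNCTION-NAME> \
        --query 'Code.Location' --output text)
    curl "$URL" | aws s3 cp - s3://<YOUR-BUCKET>/<YOUR-FUNCTION-NAME>.zip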

    Or, if you're using boto3, you could combine it with requests with stream=True.

    Sample code:

    import boto3
    import requests
    from botocore.exceptions import NoCredentialsError
    
    
    def stream_lambda_to_s3(
            lambda_function_name: str,
            bucket_name: str,
            s3_key: str,
    ) -> None:
    
        lambda_client = boto3.client('lambda')
        s3_client = boto3.client('s3')
    
        # Code.Location in the get_function response is a pre-signed URL
        # to the function's deployment package
        response = lambda_client.get_function(FunctionName=lambda_function_name)

        presigned_url = response['Code']['Location']

        # Stream the download; r.raw is a file-like object, so the zip is
        # piped straight to S3 without being written to local disk
        with requests.get(presigned_url, stream=True) as r:
            r.raise_for_status()
            try:
                s3_client.upload_fileobj(r.raw, bucket_name, s3_key)
                print(
                    f"Successfully uploaded {lambda_function_name} "
                    f"to s3://{bucket_name}/{s3_key}"
                )
            except NoCredentialsError:
                print("Credentials not available")
    
    
    if __name__ == "__main__":
        function_name = 'YOUR_LAMBDA_FUNCTION_NAME'
        target_bucket = 'YOUR_BUCKET_NAME'
        s3_key = f'{function_name}.zip'
        
        stream_lambda_to_s3(function_name, target_bucket, s3_key)
    
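
    If you'd rather skip the requests dependency, the same streaming upload should also work with the standard library's urllib.request, since upload_fileobj only needs a readable file-like object. A minimal sketch (function, bucket and key names are placeholders):

    import urllib.request

    import boto3


    def stream_lambda_to_s3_stdlib(function_name: str, bucket: str, key: str) -> None:
        lambda_client = boto3.client('lambda')
        s3_client = boto3.client('s3')

        # Code.Location is a pre-signed URL to the function's deployment package
        url = lambda_client.get_function(FunctionName=function_name)['Code']['Location']

        # urlopen returns a file-like HTTPResponse; upload_fileobj reads it
        # in chunks, so the zip never touches the local disk
        with urllib.request.urlopen(url) as body:
            s3_client.upload_fileobj(body, bucket, key)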

    Or you could use the Go SDK's Upload Manager, which uploads content to S3 concurrently by taking advantage of S3's multipart API.

    The code has been adapted from this answer.

    I've added:

    • CLI flags
    • an http.Get call, so the upload reads from resp.Body instead of a local file

    To build this, just run:

    go build -o stream main.go
    
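
    If go build complains about missing dependencies, you'll likely need to initialize a module and fetch the SDK first; something along these lines should do (the module name is arbitrary):

    go mod init stream
    go get github.com/aws/aws-sdk-go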

    And then, use it like this:

    ./stream --lambda-name YOUR-LAMBDA --bucket-name YOUR-BUCKET --key-name YOUR-NAME.zip
    

    There's also an additional flag, --region, which defaults to eu-central-1, so you only need to supply it if your region is different.

    package main
    
    import (
        "flag"
        "fmt"
        "github.com/aws/aws-sdk-go/aws"
        "github.com/aws/aws-sdk-go/aws/session"
        "github.com/aws/aws-sdk-go/service/lambda"
        "github.com/aws/aws-sdk-go/service/s3/s3manager"
        "io"
        "net/http"
    )
    
    var lambdaName string
    var bucketName string
    var keyName string
    var defaultRegion string
    
    func main() {
        flag.StringVar(
            &lambdaName,
            "lambda-name",
            "",
            "Name of the Lambda function",
        )
        flag.StringVar(
            &bucketName,
            "bucket-name",
            "",
            "Name of the S3 bucket",
        )
        flag.StringVar(
            &keyName,
            "key-name",
            "",
            "Key name for the S3 object",
        )
        flag.StringVar(
            &defaultRegion,
            "region",
            "eu-central-1",
            "AWS Region",
        )
        flag.Parse()
    
        if lambdaName == "" || bucketName == "" || keyName == "" {
            fmt.Println("All flags are required.")
            return
        }
    
        awsConfig := &aws.Config{
            Region: aws.String(defaultRegion),
        }
    
        // Get Lambda function details
        sess := session.Must(session.NewSession(awsConfig))
        lambdaService := lambda.New(sess)
        lambdaOutput, err := lambdaService.GetFunction(&lambda.GetFunctionInput{
            FunctionName: &lambdaName,
        })
        if err != nil {
            fmt.Printf("Failed to fetch Lambda function details: %v\n", err)
            return
        }
    
        resp, err := http.Get(*lambdaOutput.Code.Location)
        if err != nil {
            fmt.Printf("Failed to fetch content from pre-signed URL: %v\n", err)
            return
        }
        defer func(Body io.ReadCloser) {
            err := Body.Close()
            if err != nil {
                fmt.Printf("Failed to close response body: %v\n", err)
            }
        }(resp.Body)
    
        // Create an uploader with the session and custom options
        uploader := s3manager.NewUploader(sess, func(u *s3manager.Uploader) {
            u.PartSize = 5 * 1024 * 1024
            u.Concurrency = 2
        })
    
        // Upload the streamed content to S3
        result, err := uploader.Upload(&s3manager.UploadInput{
            Bucket: &bucketName,
            Key:    &keyName,
            Body:   resp.Body,
        })
    
        if err != nil {
            fmt.Printf("Failed to upload content to S3: %v\n", err)
            return
        }
        fmt.Printf("File uploaded to: %s\n", result. Location)
    }
    

    A note on DataSync: no, it can't do what you want. From the FAQ:

    AWS DataSync supports moving data to, from, or between Amazon Simple Storage Service (Amazon S3), Amazon Elastic File System (Amazon EFS), Amazon FSx for Windows File Server, Amazon FSx for Lustre, Amazon FSx for OpenZFS, and Amazon FSx for NetApp ONTAP.

    source