Tags: python, amazon-web-services, amazon-s3, boto3

How to bypass the expectation of an S3 server 100-continue response in Boto3's put_object method


I'm trying to get a simple boto3 client program running to put an object into a bucket on a working S3 server. I'm confident the server is up and configured properly since I can upload perfectly fine via curl and via the AWS Java SDK, etc. In fact, I can take the debug log lines from my failing boto3 script and convert them to a curl request and they work perfectly. I'm trying to upload a very small file, only 136 bytes in length.

However, my S3 server does NOT send a 100 Continue response until after the body is sent. But boto3 and botocore expect a 100 response before passing the body in a put_object request. Furthermore, boto3 and botocore specify the Content-Length of the body in the initial set of headers, so the server waits endlessly for 136 bytes of content to arrive. The result is a deadlock: the server idles waiting for the body to be transmitted, and the client times out waiting for a 100 response that's never going to come. The file is never uploaded. This server is a commercial product, so I cannot change anything about its S3 or HTTP configuration; I'm stuck with the settings as they are.

Is there an easy way to bypass the 100 response code expectation in the client code, or force the transmission of the request body even if the 100 isn't returned?
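For what it's worth, the behavior I want is exactly what a plain http.client request already does: send the headers and body together, with no Expect header at all. Here's a minimal, self-contained sketch of that (the throwaway local server below just stands in for the commercial one, reading the body and replying 200 without ever volunteering an early 100):

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class NoEarly100Handler(BaseHTTPRequestHandler):
    """Stand-in for the commercial server: reads the body, then replies."""
    protocol_version = 'HTTP/1.1'

    def do_PUT(self):
        length = int(self.headers['Content-Length'])
        self.rfile.read(length)            # blocks until the body arrives
        self.send_response(200)
        self.send_header('Content-Length', '0')
        self.end_headers()

    def log_message(self, *args):          # silence per-request logging
        pass

server = HTTPServer(('127.0.0.1', 0), NoEarly100Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# http.client sends headers and body together and adds no Expect header,
# so the upload completes even though the server never sends an early 100.
conn = http.client.HTTPConnection('127.0.0.1', server.server_port)
conn.request('PUT', '/a-bucket/jeremy_upload.txt', body=b'x' * 136,
             headers={'Content-Type': 'text/plain'})
status = conn.getresponse().status
print(status)
conn.close()
server.shutdown()
```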

Here's my boto3 code. Yes, I know SigV4 is better, but I hit the same issue on V2 and V4, and while debugging with other programs, V2 is easier to produce signatures for. This is only a draft.

import boto3
import botocore
import logging
import warnings
from boto3.compat import PythonDeprecationWarning
from botocore.config import Config
import http.client as http_client

# Suppress deprecation warnings
warnings.filterwarnings("ignore", category=PythonDeprecationWarning)

# Configure logging
logging.basicConfig(level=logging.DEBUG)

# Configure to use Signature Version 2
config = Config(
    signature_version='s3',
    retries={'max_attempts': 0},
    s3={
        'use_accelerate_endpoint': False,
        'expect_100_continue': False
    }
)

# Initialize the S3 client with the custom endpoint and configuration
s3 = boto3.client('s3', endpoint_url='http://<my-domain>:80', config=config)

# Upload the file
try:
    with open('jeremy_test.txt', 'rb') as data:
        s3.put_object(
            Bucket='a-bucket', 
            Key='jeremy_upload.txt', 
            Body=data.read(), 
            ContentType='text/plain'
        )
    print("File uploaded successfully.")
except Exception as e:
    print(f"An error occurred: {e}")

Here are the relevant log lines showing what's happening:

DEBUG:botocore.hooks:Event request-created.s3.PutObject: calling handler <function add_retry_headers at 0x7f4c43cbb4d0>
DEBUG:botocore.endpoint:Sending http request: <AWSPreparedRequest stream_output=False, method=PUT, url=http://<my-domain>:80/a-bucket/jeremy_upload.txt, headers={'Content-Type': b'text/plain', 'User-Agent': b'Boto3/1.33.13 md/Botocore#1.33.13 ua/2.0 os/linux#4.18.19-100.fc27.x86_64 md/arch#x86_64 lang/python#3.7.11 md/pyimpl#CPython cfg/retry-mode#legacy Botocore/1.33.13', 'Content-MD5': b'p5noto3k/PSRC+Krl8ivcQ==', 'Expect': b'100-continue', 'Date': b'Fri, 27 Sep 2024 03:13:32 GMT', 'Authorization': b'AWS rUNA7sEVArrRIaEp3zlA:/SY4HtAuh4x7edFBU5u8krHMGz4=', 'amz-sdk-invocation-id': b'616bbf1b-f8fb-448e-9c33-340c6d7b614c', 'amz-sdk-request': b'attempt=1', 'Content-Length': '136'}>
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): <my-domain>:80
send: b'PUT /a-bucket/jeremy_upload.txt HTTP/1.1\r\nHost: <my-domain>\r\nAccept-Encoding: identity\r\nContent-Type: text/plain\r\nUser-Agent: Boto3/1.33.13 md/Botocore#1.33.13 ua/2.0 os/linux#4.18.19-100.fc27.x86_64 md/arch#x86_64 lang/python#3.7.11 md/pyimpl#CPython cfg/retry-mode#legacy Botocore/1.33.13\r\nContent-MD5: p5noto3k/PSRC+Krl8ivcQ==\r\nExpect: 100-continue\r\nDate: Fri, 27 Sep 2024 03:13:32 GMT\r\nAuthorization: AWS rUNA7sEVArrRIaEp3zlA:/SY4HtAuh4x7edFBU5u8krHMGz4=\r\namz-sdk-invocation-id: 616bbf1b-f8fb-448e-9c33-340c6d7b614c\r\namz-sdk-request: attempt=1\r\nContent-Length: 136\r\n\r\n'
DEBUG:botocore.awsrequest:Waiting for 100 Continue response.
reply: '\r\n'
DEBUG:botocore.hooks:Event needs-retry.s3.PutObject: calling handler <botocore.retryhandler.RetryHandler object at 0x7f4c46343e90>
An error occurred: Connection was closed before we received a valid response from endpoint URL: "http://<my-domain>:80/a-bucket/jeremy_upload.txt".

I've tried setting the config to not expect_100_continue, and I've tried event handlers, but I'm very green at boto3/botocore, so I've had no luck.


Solution

  • You're out of luck if you want to use boto3, and I'm guessing that the other SDKs will eventually follow suit.

    The API documentation says nothing about the possibility of receiving a 100 status code, although the examples do show an Expect: 100-continue request header. However, the S3 User Guide does: the 100 status is intended as an optimization, to avoid sending request bodies multiple times in the event of a redirect.

    Unfortunately, botocore unconditionally adds the Expect header to the request (in handlers.py's add_expect_header(), which is invoked based on a table of request actions). Then, if the header is set, it waits for the 100 response.

    So, either you need to update your server to make it behave like S3 (including the parts that are documented outside the API doc), or you need to fork botocore to make that header optional or remove it entirely.

    If you do the latter, maybe control it via a config setting and then submit the patch back to AWS.