Tags: amazon-web-services, amazon-s3, proxy, query-string, amazon-cloudfront

Configure CloudFront to proxy query string as part of S3 object key?


We have AWS CloudFront configured to proxy requests onto an S3 bucket.

CloudFront can be configured to forward query string params to the origin server (docs), and we have this enabled as well.

The problem I have is that the query string param doesn't appear to be included in the S3 object key lookup.

So for example, we have object keys like:

  • foo
  • foo?a=1
  • foo?b=2

But if we make a request like https://cf-distro-url.com/foo?a=1 or https://cf-distro-url.com/foo?b=2, we end up with the content of the foo object rather than the content of the specific key.

It looks as though CloudFront is forwarding the query string params, but they aren't used as part of the S3 object key lookup.

Has anyone experienced this before, or does anyone know of a solution?

Thanks!


Solution

  • Using a web browser, neither CloudFront -- nor S3, if accessed directly -- will work as you are expecting with ? in an object key, because an unescaped ? is the delimiter for the query string in HTTP.

    By definition, that ? marks the end of the path, and thus the end of the object key.

    The only way to fetch an object with ? in the key is for the browser to send the request with the ? escaped as %3F, which means the URL has to be originally presented to the browser that way.

    You can't upload an object to S3 with a ? in its key name and subsequently access it from a browser with the literal ? in the URL.

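    This is demonstrated below with curl, but browser behavior is the same. Neither object exists in the test bucket, so both requests return 404; the <Key> element in each error body shows which key S3 actually looked up.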

    With /foo?bar=1 in the URI:

    $ curl -v 'http://.....s3.amazonaws.com/foo?bar=1'
    * About to connect() to .....s3.amazonaws.com port 80 (#0)
    *   Trying x.x.x.x... connected
    > GET /foo?bar=1 HTTP/1.1
    
    < HTTP/1.1 404 Not Found
    
    <?xml version="1.0" encoding="UTF-8"?>
    <Error>
      <Code>NoSuchKey</Code>
      <Message>The specified key does not exist.</Message>
      <Key>foo</Key>                                      <<< requested object is "foo"
      <RequestId>...</RequestId>
      <HostId>...</HostId>
    </Error>
    

    With /foo%3Fbar=1 in the URI:

    $ curl -v 'http://.....s3.amazonaws.com/foo%3Fbar=1'
    * About to connect() to .....s3.amazonaws.com port 80 (#0)
    *   Trying x.x.x.x... connected
    > GET /foo%3Fbar=1 HTTP/1.1
    
    < HTTP/1.1 404 Not Found
    
    <?xml version="1.0" encoding="UTF-8"?>
    <Error>
      <Code>NoSuchKey</Code>
      <Message>The specified key does not exist.</Message>
      <Key>foo?bar=1</Key>                                <<< requested object is "foo?bar=1"
      <RequestId>...</RequestId>
      <HostId>...</HostId>
    </Error>
    

    This is a consequence of the way URLs are formed, not a limitation of S3 (or CloudFront), but it is mentioned in the S3 documentation:

    The following characters in a key name may require additional code handling and will likely need to be URL encoded or referenced as HEX

    ...

    Question mark ("?")

    http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html#object-keys

    If you're able to upload such objects correctly with an SDK, then it is escaping the characters for you. A web browser will not do that implicitly, because ? has a specific meaning.

    Whether or not CloudFront is passing the query string to S3, the net behavior is the same -- the query string is not part of the key.
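
As a rough illustration of the point above, here is a minimal Python sketch using only the standard library's urllib.parse. It shows how a URL is split at the first unescaped ?, and how a key containing ? can be percent-encoded so the whole key stays in the path. The helper names key_from_url and url_for_key are hypothetical, the hostname is the placeholder from the question, and the sketch assumes the entire URL path maps to the object key (no origin path configured in CloudFront).

```python
from urllib.parse import quote, unquote, urlsplit

# Placeholder distribution hostname taken from the question's example URLs.
BASE_URL = "https://cf-distro-url.com"


def key_from_url(url: str) -> str:
    """Return the object key addressed by `url`, assuming the path is the key."""
    path = urlsplit(url).path   # everything before the first unescaped "?"
    return unquote(path[1:])    # drop the leading "/" and decode %XX escapes


def url_for_key(key: str) -> str:
    """Build a URL whose path percent-encodes reserved characters in `key`."""
    # "/" and "=" are left literal; "?" becomes %3F so it stays in the path.
    return f"{BASE_URL}/{quote(key, safe='/=')}"


if __name__ == "__main__":
    print(key_from_url(f"{BASE_URL}/foo?a=1"))   # foo
    print(url_for_key("foo?a=1"))                # https://cf-distro-url.com/foo%3Fa=1
    print(key_from_url(url_for_key("foo?a=1")))  # foo?a=1
```

In practice this means any link handed to a browser for such an object would need to be generated in the encoded form (for example by something like url_for_key) rather than written with a literal ?.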