How to cope with the performance of generating signed URLs for accessing private content via CloudFront?


A common use case of AWS S3 and CloudFront is serving private content. The common solution is using signed CloudFront URLs to access private files stored using S3.

However, generating these URLs has a cost: each one requires computing an RSA signature of the URL with a private key. In Python, boto (the AWS SDK for Python) uses the rsa (https://pypi.python.org/pypi/rsa) library for this task. On my late 2014 MBP, each computation takes about 25 ms with a 2048-bit key.

This cost can limit the scalability of an application that uses this approach to authorize access to private content via CloudFront. Imagine many clients frequently requesting access to many files at 25~30 ms per request.

It seems to me that not much can be improved in the signature computation itself, although the rsa library mentioned above was last updated almost 1.5 years ago. I wonder whether other techniques or designs could optimize this process and achieve higher scalability. Or do we simply have to throw more hardware at it and solve it by brute force?

One optimization would be to have the API endpoint accept multiple files per request and return the signed URLs in bulk, rather than handling them individually in separate requests. The total time needed to compute all those signatures would still be the same, though.


Solution

  • Use Signed Cookies

    When I use CloudFront with many private URLs, I prefer Signed Cookies whenever all of their restrictions are met. This does not make the signing itself any faster, but it reduces the number of signing operations to one per user until the cookies expire.
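
To make the "one signature per user" idea concrete, here is a minimal Python 3 sketch of building the three CloudFront cookies for a custom policy. It is an illustration, not boto's implementation: the function names, the wildcard resource, and the pluggable `sign` callable (any function that takes bytes and returns raw RSA-SHA1 signature bytes, such as sign_with_cryptography from the script further down) are assumptions made for this example.

```python
import base64
import json


def cloudfront_b64(data):
    """Base64-encode bytes, then swap the characters CloudFront expects replaced."""
    encoded = base64.b64encode(data).decode("ascii")
    return encoded.replace("+", "-").replace("=", "_").replace("/", "~")


def make_signed_cookies(resource, expires_epoch, key_pair_id, sign):
    """Return the cookie name/value pairs for a custom policy.

    `sign` is a hypothetical hook: a callable that takes the policy bytes
    and returns the raw signature bytes. It is called exactly once.
    """
    policy = json.dumps(
        {"Statement": [{
            "Resource": resource,
            "Condition": {"DateLessThan": {"AWS:EpochTime": expires_epoch}},
        }]},
        separators=(",", ":"),  # the policy is serialized without whitespace
    ).encode("utf-8")
    return {
        "CloudFront-Policy": cloudfront_b64(policy),
        "CloudFront-Signature": cloudfront_b64(sign(policy)),
        "CloudFront-Key-Pair-Id": key_pair_id,
    }
```

With a wildcard resource such as "https://example.cloudfront.net/private/*" (a made-up distribution), the single signature covers every matching file, so the per-file signing cost disappears for the lifetime of the cookies.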

    Tuning RSA Signature Generation

    You may have requirements that rule out signed cookies. For that case, I tried to speed up the signing by comparing the rsa module that boto uses with cryptography. Two other alternatives are m2crypto and pycrypto, but for this example I will use cryptography.

    To test the performance of signing URLs with the different modules, I overrode the method _sign_string in new Distribution subclasses, removing all logic except the signing of a string. I then took the private key and example URL from the boto tests to test with.

    The results show that cryptography is quicker (roughly nine times faster in this test) but still requires close to 1 ms per signing request. These results are skewed higher by IPython's use of scoped variables in timing.

    timeit -n10000 rsa_distribution.create_signed_url(url, message, expire_time)
    10000 loops, best of 3: 6.01 ms per loop
    
    timeit -n10000 cryptography_distribution.create_signed_url(url, message, expire_time)
    10000 loops, best of 3: 644 µs per loop
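
For timing outside IPython, a small harness around the standard timeit module can be used. The following is only a sketch in Python 3; the helper name time_per_call and its defaults are made up for this example, and the lambda wrapper adds a small constant overhead to every measurement.

```python
import timeit


def time_per_call(func, *args, number=1000, repeat=3):
    """Return the best observed time in seconds for one call of func(*args)."""
    timer = timeit.Timer(lambda: func(*args))
    # Each repeat runs `number` calls; take the fastest run, as timeit reports.
    return min(timer.repeat(repeat=repeat, number=number)) / number
```

For example, time_per_call(sign_with_cryptography, message) and time_per_call(sign_with_rsa, message) would compare the two signing functions from the script directly, without going through create_signed_url.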
    

    The full script:

    from cryptography.hazmat.primitives.asymmetric import padding
    from cryptography.hazmat.primitives import serialization
    from cryptography.hazmat.backends import default_backend
    from cryptography.hazmat.primitives import hashes
    
    import rsa
    
    from boto.cloudfront.distribution import Distribution
    
    from textwrap import dedent
    
    # The private key provided in the Boto tests
    pk_key = dedent("""
        -----BEGIN RSA PRIVATE KEY-----
        MIICXQIBAAKBgQDA7ki9gI/lRygIoOjV1yymgx6FYFlzJ+z1ATMaLo57nL57AavW
        hb68HYY8EA0GJU9xQdMVaHBogF3eiCWYXSUZCWM/+M5+ZcdQraRRScucmn6g4EvY
        2K4W2pxbqH8vmUikPxir41EeBPLjMOzKvbzzQy9e/zzIQVREKSp/7y1mywIDAQAB
        AoGABc7mp7XYHynuPZxChjWNJZIq+A73gm0ASDv6At7F8Vi9r0xUlQe/v0AQS3yc
        N8QlyR4XMbzMLYk3yjxFDXo4ZKQtOGzLGteCU2srANiLv26/imXA8FVidZftTAtL
        viWQZBVPTeYIA69ATUYPEq0a5u5wjGyUOij9OWyuy01mbPkCQQDluYoNpPOekQ0Z
        WrPgJ5rxc8f6zG37ZVoDBiexqtVShIF5W3xYuWhW5kYb0hliYfkq15cS7t9m95h3
        1QJf/xI/AkEA1v9l/WN1a1N3rOK4VGoCokx7kR2SyTMSbZgF9IWJNOugR/WZw7HT
        njipO3c9dy1Ms9pUKwUF46d7049ck8HwdQJARgrSKuLWXMyBH+/l1Dx/I4tXuAJI
        rlPyo+VmiOc7b5NzHptkSHEPfR9s1OK0VqjknclqCJ3Ig86OMEtEFBzjZQJBAKYz
        470hcPkaGk7tKYAgP48FvxRsnzeooptURW5E+M+PQ2W9iDPPOX9739+Xi02hGEWF
        B0IGbQoTRFdE4VVcPK0CQQCeS84lODlC0Y2BZv2JxW3Osv/WkUQ4dslfAQl1T303
        7uwwr7XTroMv8dIFQIPreoPhRKmd/SbJzbiKfS/4QDhU
        -----END RSA PRIVATE KEY-----""")
    
    # Initializing keys in a global context
    cryptography_private_key = serialization.load_pem_private_key(
        pk_key,
        password=None,
        backend=default_backend())
    
    
    # Sign with PKCS#1 v1.5 padding and SHA-1. This combination is not
    # recommended in general, but it is what CloudFront requires.
    def sign_with_cryptography(message):
        # sign() replaces the older signer()/update()/finalize() API, which
        # was deprecated and later removed from cryptography.
        return cryptography_private_key.sign(
            message,
            padding.PKCS1v15(),
            hashes.SHA1())
    
    
    # Initializing the key in a global context
    rsa_private_key = rsa.PrivateKey.load_pkcs1(pk_key)
    
    
    def sign_with_rsa(message):
        signature = rsa.sign(str(message), rsa_private_key, 'SHA-1')
    
        return signature
    
    
    # All this information comes from the Boto tests.
    url = "http://d604721fxaaqy9.cloudfront.net/horizon.jpg?large=yes&license=yes"
    expected_url = "http://d604721fxaaqy9.cloudfront.net/horizon.jpg?large=yes&license=yes&Expires=1258237200&Signature=Nql641NHEUkUaXQHZINK1FZ~SYeUSoBJMxjdgqrzIdzV2gyEXPDNv0pYdWJkflDKJ3xIu7lbwRpSkG98NBlgPi4ZJpRRnVX4kXAJK6tdNx6FucDB7OVqzcxkxHsGFd8VCG1BkC-Afh9~lOCMIYHIaiOB6~5jt9w2EOwi6sIIqrg_&Key-Pair-Id=PK123456789754"
    message = "PK123456789754"
    expire_time = 1258237200
    
    
    class CryptographyDistribution(Distribution):
        def _sign_string(
                self,
                message,
                private_key_file=None,
                private_key_string=None):
            return sign_with_cryptography(message)
    
    
    class RSADistribution(Distribution):
        def _sign_string(
                self,
                message,
                private_key_file=None,
                private_key_string=None):
            return sign_with_rsa(message)
    
    
    cryptography_distribution = CryptographyDistribution()
    rsa_distribution = RSADistribution()
    
    cryptography_url = cryptography_distribution.create_signed_url(
        url,
        message,
        expire_time)
    
    rsa_url = rsa_distribution.create_signed_url(
        url,
        message,
        expire_time)
    
    assert cryptography_url == rsa_url == expected_url, "URLs do not match"
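
To show what the expected URL above is made of, here is a rough Python 3 sketch of assembling a canned-policy signed URL by hand. It is a simplified illustration rather than boto's actual code: canned_signed_url is a hypothetical name, and `sign` stands for any callable that takes the policy bytes and returns the raw signature bytes (for example sign_with_cryptography from the script).

```python
import base64


def canned_signed_url(url, expires_epoch, key_pair_id, sign):
    """Append Expires, Signature and Key-Pair-Id to `url`.

    `sign` is an assumed hook: bytes in, raw RSA-SHA1 signature bytes out.
    """
    # The canned policy is a fixed JSON template without whitespace; only
    # the resource URL and the expiry time vary.
    policy = ('{"Statement":[{"Resource":"%s","Condition":'
              '{"DateLessThan":{"AWS:EpochTime":%d}}}]}' % (url, expires_epoch))
    signature = base64.b64encode(sign(policy.encode("utf-8"))).decode("ascii")
    # CloudFront expects '+', '=' and '/' replaced in the encoded signature.
    signature = signature.replace("+", "-").replace("=", "_").replace("/", "~")
    separator = "&" if "?" in url else "?"
    return "%s%sExpires=%d&Signature=%s&Key-Pair-Id=%s" % (
        url, separator, expires_epoch, signature, key_pair_id)
```

The only expensive step is the call to `sign`; everything else is string formatting, which is why swapping the signing library changes the timings so much.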
    

    Conclusion

    Although the cryptography module performs better in this test, I still recommend finding a way to use signed cookies. Either way, I hope this information is useful.