A common use case of AWS S3 and CloudFront is serving private content. The usual solution is to use signed CloudFront URLs to access private files stored in S3.
However, generating these URLs comes at a cost: computing the RSA signature of each URL with a private key. For Python (or boto, AWS's Python SDK), the rsa library (https://pypi.python.org/pypi/rsa) is used for this task. On my late 2014 MBP, it takes about 25ms per computation with a 2048-bit key.
This cost potentially impacts the scalability of an application that uses this approach to authorize access to private content via CloudFront. Imagine multiple clients frequently requesting access to multiple files at 25~30ms per request.
It seems to me that not much can be improved in the signature computation itself, though the rsa library mentioned above was last updated almost 1.5 years ago. I wonder if there are other techniques or designs that could optimize this process and achieve higher scalability. Or do we simply have to throw more hardware at it and solve it by brute force?
One optimization would be to have the API endpoint accept multiple files per request and return the signed URLs in bulk, rather than handling them individually in separate requests, but the total time needed to compute all those signatures is still there.
Use Signed Cookies
When I use CloudFront with many private URLs, I prefer to use Signed Cookies when all the restrictions are met. This does not speed up the signing itself, but it reduces the number of signing requests to one per user until the cookies expire.
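As a rough sketch of what a signed-cookie setup involves, the snippet below builds the three cookies CloudFront expects for a custom policy. The helper names, the `sign` callable, and the `/private/*` path are assumptions for illustration, not part of boto; in practice `sign` would wrap whichever RSA-SHA1 signing function you use.

```python
import base64
import json


def cloudfront_b64(data):
    """Base64-encode bytes, swapping the characters CloudFront disallows."""
    encoded = base64.b64encode(data).decode("ascii")
    return encoded.replace("+", "-").replace("=", "_").replace("/", "~")


def build_signed_cookies(resource, expires, key_pair_id, sign):
    """Return the three CloudFront cookies for a custom policy.

    `sign` is a caller-supplied (hypothetical) function that takes the
    policy bytes and returns the raw RSA-SHA1 signature bytes.
    """
    # The policy is serialized without whitespace before it is signed.
    policy = json.dumps(
        {"Statement": [{
            "Resource": resource,
            "Condition": {"DateLessThan": {"AWS:EpochTime": expires}},
        }]},
        separators=(",", ":"),
    ).encode("utf-8")
    return {
        "CloudFront-Policy": cloudfront_b64(policy),
        "CloudFront-Signature": cloudfront_b64(sign(policy)),
        "CloudFront-Key-Pair-Id": key_pair_id,
    }
```

Because the policy's Resource may contain a wildcard, a single signature can cover every matching file, which is where the saving over per-URL signing comes from.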
Tuning RSA Signature Generation
I can imagine you may have requirements that rule out signed cookies. For that case, I tried to speed up the signing by comparing the rsa module that boto uses against cryptography. Two other alternatives are m2crypto and pycrypto, but for this example I will use cryptography.
To test the performance of signing URLs with different modules, I reduced the method _sign_string to remove any logic except the signing of a string, then created a new Distribution class for each module. Then I took the private key and example URL from the boto tests to test with.
The results show that cryptography is quicker (644 µs versus 6.01 ms per call) but still requires close to 1ms per signing request. These results are skewed higher by IPython's use of scoped variables in timing.
timeit -n10000 rsa_distribution.create_signed_url(url, message, expire_time)
10000 loops, best of 3: 6.01 ms per loop
timeit -n10000 cryptography_distribution.create_signed_url(url, message, expire_time)
10000 loops, best of 3: 644 µs per loop
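To repeat the measurement outside IPython, the standard-library timeit module could be used along these lines. This is only a sketch: `best_time_per_call` and `stand_in_signer` are hypothetical names, and the stand-in workload is a plain modular exponentiation rather than a real RSA signature, so swap in the actual signing call before reading anything into the numbers.

```python
import timeit


def best_time_per_call(func, number=1000, repeat=3):
    """Return the best average seconds-per-call over `repeat` runs."""
    totals = timeit.repeat(func, number=number, repeat=repeat)
    return min(totals) / number


def stand_in_signer():
    # Placeholder workload (NOT a real signature); replace with e.g. a
    # lambda that calls create_signed_url on the distribution under test.
    return pow(0x1234567, 65537, (1 << 1024) - 159)


if __name__ == "__main__":
    seconds = best_time_per_call(stand_in_signer)
    print("%.1f us per call" % (seconds * 1e6))
```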
The full script:
from cryptography.hazmat.primitives.asymmetric import padding
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import hashes
import rsa
from boto.cloudfront.distribution import Distribution
from textwrap import dedent
# The private key provided in the Boto tests
pk_key = dedent("""
-----BEGIN RSA PRIVATE KEY-----
MIICXQIBAAKBgQDA7ki9gI/lRygIoOjV1yymgx6FYFlzJ+z1ATMaLo57nL57AavW
hb68HYY8EA0GJU9xQdMVaHBogF3eiCWYXSUZCWM/+M5+ZcdQraRRScucmn6g4EvY
2K4W2pxbqH8vmUikPxir41EeBPLjMOzKvbzzQy9e/zzIQVREKSp/7y1mywIDAQAB
AoGABc7mp7XYHynuPZxChjWNJZIq+A73gm0ASDv6At7F8Vi9r0xUlQe/v0AQS3yc
N8QlyR4XMbzMLYk3yjxFDXo4ZKQtOGzLGteCU2srANiLv26/imXA8FVidZftTAtL
viWQZBVPTeYIA69ATUYPEq0a5u5wjGyUOij9OWyuy01mbPkCQQDluYoNpPOekQ0Z
WrPgJ5rxc8f6zG37ZVoDBiexqtVShIF5W3xYuWhW5kYb0hliYfkq15cS7t9m95h3
1QJf/xI/AkEA1v9l/WN1a1N3rOK4VGoCokx7kR2SyTMSbZgF9IWJNOugR/WZw7HT
njipO3c9dy1Ms9pUKwUF46d7049ck8HwdQJARgrSKuLWXMyBH+/l1Dx/I4tXuAJI
rlPyo+VmiOc7b5NzHptkSHEPfR9s1OK0VqjknclqCJ3Ig86OMEtEFBzjZQJBAKYz
470hcPkaGk7tKYAgP48FvxRsnzeooptURW5E+M+PQ2W9iDPPOX9739+Xi02hGEWF
B0IGbQoTRFdE4VVcPK0CQQCeS84lODlC0Y2BZv2JxW3Osv/WkUQ4dslfAQl1T303
7uwwr7XTroMv8dIFQIPreoPhRKmd/SbJzbiKfS/4QDhU
-----END RSA PRIVATE KEY-----""")
# Initializing keys in a global context
cryptography_private_key = serialization.load_pem_private_key(
    pk_key,
    password=None,
    backend=default_backend())
# Sign with PKCS #1 v1.5 padding and SHA-1; this is not recommended in general but is required by Amazon
def sign_with_cryptography(message):
    signer = cryptography_private_key.signer(
        padding.PKCS1v15(),
        hashes.SHA1())
    signer.update(message)
    return signer.finalize()
# Initializing the key in a global context
rsa_private_key = rsa.PrivateKey.load_pkcs1(pk_key)
def sign_with_rsa(message):
    signature = rsa.sign(str(message), rsa_private_key, 'SHA-1')
    return signature
# All this information comes from the Boto tests.
url = "http://d604721fxaaqy9.cloudfront.net/horizon.jpg?large=yes&license=yes"
expected_url = "http://d604721fxaaqy9.cloudfront.net/horizon.jpg?large=yes&license=yes&Expires=1258237200&Signature=Nql641NHEUkUaXQHZINK1FZ~SYeUSoBJMxjdgqrzIdzV2gyEXPDNv0pYdWJkflDKJ3xIu7lbwRpSkG98NBlgPi4ZJpRRnVX4kXAJK6tdNx6FucDB7OVqzcxkxHsGFd8VCG1BkC-Afh9~lOCMIYHIaiOB6~5jt9w2EOwi6sIIqrg_&Key-Pair-Id=PK123456789754"
message = "PK123456789754"
expire_time = 1258237200
class CryptographyDistribution(Distribution):
    def _sign_string(
            self,
            message,
            private_key_file=None,
            private_key_string=None):
        return sign_with_cryptography(message)

class RSADistribution(Distribution):
    def _sign_string(
            self,
            message,
            private_key_file=None,
            private_key_string=None):
        return sign_with_rsa(message)
cryptography_distribution = CryptographyDistribution()
rsa_distribution = RSADistribution()
cryptography_url = cryptography_distribution.create_signed_url(
    url,
    message,
    expire_time)
rsa_url = rsa_distribution.create_signed_url(
    url,
    message,
    expire_time)
assert cryptography_url == rsa_url == expected_url, "URLs do not match"
Conclusion
Although the cryptography module performs better in this test, I still recommend trying to find a way to use signed cookies. Either way, I hope this information is useful.