
RSA JWT key rotation period?


I have created a basic JWT generator but need advice on a couple of aspects. I have been using JWT.io's guides and the auth0/java-jwt library to produce the tokens.

The JWTs are being signed with 2 different keys.

The refresh tokens are being signed using the RSA512 algorithm using a 4096bit key.

.sign(Algorithm.RSA512(rsaPublicKey, rsaPrivateKey));

The access tokens are being signed with a 1024 bit RSA key through the RSA256 algorithm.

 .sign(Algorithm.RSA256(rsaPublicKey, rsaPrivateKey));

I chose this based on recommendations about "speed": verification with a 4096-bit key takes longer, but seeing as there are fewer requests for refresh tokens, the trade-off for security seems fair.

On the other hand, the access tokens are verified at the resource server endpoints, and they are sent much more often, so I have opted for a shorter (256) signature which is performed with a quicker 1024-bit key.

I know the keys are "practically" impossible to break... but is key rotation still recommended?

I hold the jks (keystore) in a private folder on the auth server and the resource servers. The keystore holds the 2 keypairs, one for the refresh token sign/verify and one for the access token sign/verify.

Do I need to refresh/form new keys? If so...how often? What is the recommended way to do this?

There can be multiple instances of the auth and resource microservices behind load balancers...so RAM generated keys are a no as they will not propagate between instances.

I have looked at maybe having a "key-server" that could, say, create new keys, append them to the keystore, and dish out the new jks file so instances update with the new keypairs... similar to this: File Sharing Between Instances

So for example, every 15 seconds, the EC2 auth servers and resource servers ping the key-server requesting a copy of the current jks (and version check).

Any recommendations?

Thank you!


Solution

    JWT RSA Key Sizing

    Change the RSA keys to 2048 bits, which is the currently recommended size (as of 2020).

    1024-bit RSA keys are considered weak and have been prohibited by NIST for highly confidential information. (Tip: the central auth system is as confidential as it gets.) They can be cracked given enough compute power, bearing in mind that any large organization has access to datacenters with 10k+ CPUs.

    4096-bit keys are possible, but can be 10 times slower to verify than 2048-bit keys (complexity is not linear with size). Consider the performance impact carefully. Authentication tokens WILL be used everywhere and verified bazillions of times.

    See related answer on What RSA key length should I use for my SSL certificates?
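As a rough illustration of the sizing advice above, here is a minimal sketch that generates a 2048-bit key pair and signs with SHA256withRSA, the JCA name for what JWT calls RS256. It uses only the JDK rather than java-jwt, and the class and method names are made up for this example. It also shows that the signature length follows the key size (256 bytes for a 2048-bit key), not the hash size.

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;
import java.security.interfaces.RSAPublicKey;

// Hypothetical helper class for illustration; not part of java-jwt.
public class RsaKeySizeSketch {

    // Generates an RSA key pair with the requested modulus size in bits.
    static KeyPair generate(int bits) throws Exception {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(bits);
        return generator.generateKeyPair();
    }

    // Signs the data with SHA256withRSA (RS256 in JWT terms).
    static byte[] sign(KeyPair pair, byte[] data) throws Exception {
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(pair.getPrivate());
        signer.update(data);
        return signer.sign();
    }

    // Verifies the signature using only the public half of the pair.
    static boolean verify(KeyPair pair, byte[] data, byte[] signature) throws Exception {
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(pair.getPublic());
        verifier.update(data);
        return verifier.verify(signature);
    }

    public static void main(String[] args) throws Exception {
        KeyPair pair = generate(2048);
        byte[] payload = "header.payload".getBytes(StandardCharsets.UTF_8);
        byte[] signature = sign(pair, payload);

        int modulusBits = ((RSAPublicKey) pair.getPublic()).getModulus().bitLength();
        System.out.println("modulus bits: " + modulusBits);         // 2048
        System.out.println("signature bytes: " + signature.length); // 256
        System.out.println("verified: " + verify(pair, payload, signature));
    }
}
```

With java-jwt, the same key pair would be passed to Algorithm.RSA256(rsaPublicKey, rsaPrivateKey) as in the question; only the key size changes.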

    JWT Key Rotation

    This assumes JWT is used alongside OpenID Connect (OIDC).

    The active JWT public keys can be obtained from the OIDC server, on an endpoint like /.well-known/keys. Refer to the documentation of your OIDC server.

    Applications should retrieve public keys on startup and refresh them periodically. There isn't a formal standard on how often.

    • Common practice is to refresh keys at an interval between 1 hour and 1 week.
    • Applications that restart automatically periodically (web containers) might load keys upon startup and not actively refresh them during run.
    • Servers usually have a scheduled reboot cycle (maybe monthly or quarterly), putting an upper limit on how long anything can run.
    • One example: the Apache plugin mod_auth_openidc retrieves keys hourly by default (setting OIDCJWKSRefreshInterval).
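The load-on-startup-then-refresh pattern described above can be sketched roughly as follows. This is an illustration under assumptions, not how any particular OIDC library does it: the class name, the fetcher supplier (standing in for an HTTP call to the provider's keys endpoint), and the choice to keep the previous keys when a refresh fails are all hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical cache of verification keys, indexed by key id ("kid").
public class RefreshingKeyCache<K> implements AutoCloseable {
    private final Supplier<Map<String, K>> fetcher; // assumed to call the OIDC server
    private final ScheduledExecutorService scheduler;
    private volatile Map<String, K> keys;

    RefreshingKeyCache(Supplier<Map<String, K>> fetcher, long interval, TimeUnit unit) {
        this.fetcher = fetcher;
        // Load once on startup; an exception here fails fast before serving traffic.
        this.keys = snapshot(fetcher.get());
        this.scheduler = Executors.newSingleThreadScheduledExecutor(task -> {
            Thread thread = new Thread(task, "key-refresh");
            thread.setDaemon(true); // do not keep the JVM alive just for refreshes
            return thread;
        });
        this.scheduler.scheduleWithFixedDelay(this::refresh, interval, interval, unit);
    }

    // Replaces the key set; on failure the previously loaded keys stay in use.
    void refresh() {
        try {
            keys = snapshot(fetcher.get());
        } catch (RuntimeException e) {
            // Assumption: a provider outage should not break local verification.
        }
    }

    Optional<K> find(String kid) {
        return Optional.ofNullable(keys.get(kid));
    }

    private static <T> Map<String, T> snapshot(Map<String, T> source) {
        return Collections.unmodifiableMap(new HashMap<>(source));
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) {
        AtomicInteger version = new AtomicInteger();
        // Fake fetcher: each call returns a single key with a new kid.
        Supplier<Map<String, String>> fake = () ->
                Collections.singletonMap("key-" + version.incrementAndGet(), "public-key-material");

        try (RefreshingKeyCache<String> cache = new RefreshingKeyCache<>(fake, 1, TimeUnit.HOURS)) {
            System.out.println(cache.find("key-1").isPresent()); // true, loaded at startup
            cache.refresh();                                     // what the hourly task would do
            System.out.println(cache.find("key-2").isPresent()); // true after the refresh
        }
    }
}
```

The interval here mirrors the hourly default mentioned above; anything between an hour and a week fits the common practice described.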

    Existing tokens are invalidated when their signing key is rotated off, and new tokens are not accepted if applications did not keep up with the newer signing key. So there are bounds to consider for things to work well.

    • Common practice is to rotate keys every 1 to 12 months.
    • Okta provides their examples with 90 days keys.
    • A website like Facebook almost never requires users to re-authenticate (months? years? have you ever had to login again?) so signing keys there have to last for months, whereas a banking website doesn't need to support multiple months sessions.
    • There is generally no point in rotating keys more frequently than monthly. It only exposes subtle issues with software not reloading often enough and prevents "long" sessions.

    My personal recommendation to ensure maximum security and minimum hassle, having managed single-sign-on for thousands of applications across thousands of systems in large organizations, is:

    • Signing keys are valid for 1 year.
    • Signing keys are rotated every 6 months.
    • Meaning there are at least 2 keys available from /.../keys at all times: one active key and one future key pending to replace it.

    Benefits:
    • This leaves applications ample time to pick up the next key (6 months), whether by actively refreshing or by passive restarts.
    • 6 months is long enough that keys can be hardcoded into libraries/applications for special use cases that require it. For example we have HPC-like compute clusters deploying 10000 tasks/processes at once, that might DDoS the shit out of the OIDC server if every single one of them tried to fetch keys remotely on startup.
    • Gotta rotate often enough (6 months absolute top) for anything to work and be tested. If a developer performs some integrations and doesn't handle rotation well, it's gonna blow up within 6 months and they can fix it (hopefully still in testing phase or with limited users). If the rotation happens after 2 years instead, nobody will notice it's gonna break until it breaks and there's nobody to fix it, all the original developers having long left.
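To make the 1-year validity and 6-month rotation concrete, here is a small sketch of one possible reading of that schedule. It assumes each key is published 6 months before it starts signing and is removed 12 months after publication; the class, the kid naming, and the start date are invented for the example.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical model of the rotation schedule; dates and kids are made up.
public class RotationSchedule {
    static final int ROTATION_MONTHS = 6;  // a new key is published every 6 months
    static final int VALIDITY_MONTHS = 12; // each key stays published for 1 year

    static final class SigningKey {
        final String kid;
        final LocalDate published;

        SigningKey(String kid, LocalDate published) {
            this.kid = kid;
            this.published = published;
        }

        // Assumption: a key spends its first 6 months as the "future" key.
        LocalDate activeFrom() { return published.plusMonths(ROTATION_MONTHS); }
        LocalDate expires() { return published.plusMonths(VALIDITY_MONTHS); }

        boolean isPublishedOn(LocalDate day) {
            return !day.isBefore(published) && day.isBefore(expires());
        }

        boolean isSigningOn(LocalDate day) {
            return !day.isBefore(activeFrom()) && day.isBefore(expires());
        }
    }

    // Builds consecutive keys, each published 6 months after the previous one.
    static List<SigningKey> schedule(LocalDate first, int count) {
        List<SigningKey> keys = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            keys.add(new SigningKey("key-" + (i + 1), first.plusMonths((long) i * ROTATION_MONTHS)));
        }
        return keys;
    }

    // Key ids that the keys endpoint would list on the given day.
    static List<String> publishedOn(List<SigningKey> keys, LocalDate day) {
        List<String> kids = new ArrayList<>();
        for (SigningKey key : keys) {
            if (key.isPublishedOn(day)) kids.add(key.kid);
        }
        return kids;
    }

    // Key id used to sign new tokens on the given day, if any.
    static Optional<String> signerOn(List<SigningKey> keys, LocalDate day) {
        for (SigningKey key : keys) {
            if (key.isSigningOn(day)) return Optional.of(key.kid);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<SigningKey> keys = schedule(LocalDate.of(2020, 1, 1), 4);
        LocalDate day = LocalDate.of(2020, 9, 15);
        System.out.println(publishedOn(keys, day));             // [key-1, key-2]
        System.out.println(signerOn(keys, day).orElse("none")); // key-1
    }
}
```

Under this reading, an application that refreshes or restarts at least once in any 6-month window always knows the next signing key before it becomes active.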

    DDoS

    By the way, timelines are never in seconds; it's funny that the question mentions seconds.

    An authentication system is depended upon by everything in a company. When "everything" (thousands of services) is trying to ping the same service every few seconds (or even every few minutes), that's a quick way to understand the concept of a permanent accidental DDoS.

    One of the primary goals of JWT was precisely to not need a central service to verify tokens (a massive constantly-loaded single-point-of-failure). You can embrace the goal of limiting dependencies by only loading signing keys -remotely- once on startup (assuming the services you run are restarting periodically).