java spring-boot spring-security jwk spring-resource-server

Spring Boot Security Resource Server - how to reload a JWKset on JWT validation failure

Given the following code:

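import org.springframework.context.annotation.Bean
import org.springframework.context.annotation.Configuration
import org.springframework.security.config.annotation.web.builders.HttpSecurity
import org.springframework.security.config.annotation.web.configuration.EnableWebSecurity
import org.springframework.security.config.annotation.web.invoke // enables the http { } Kotlin DSL
import org.springframework.security.web.SecurityFilterChain

@Configuration
@EnableWebSecurity
class MyCustomSecurityConfiguration {
    @Bean
    open fun filterChain(http: HttpSecurity): SecurityFilterChain {
        http {
            // Every request must be authenticated
            authorizeRequests {
                authorize(anyRequest, authenticated)
            }
            // Validate bearer tokens as JWTs using the configured jwk-set-uri
            oauth2ResourceServer {
                jwt {}
            }
        }
        return http.build()
    }
}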

And this configuration:

spring:
  security:
    oauth2:
      resourceserver:
        jwt:
          jwk-set-uri: https://idp.example.com/.well-known/jwks.json

With this setup, every incoming request is checked for a valid JWT by the Spring OAuth 2.0 Resource Server dependency, using the keys from the JWK set. This works, but I occasionally see 401s. I know this happens because the JWK set I'm loading is very dynamic: we use rotating keys for signing JWTs, and keys are added to and removed from the JWK set frequently.

By default, Nimbus refreshes the in-memory JWK set every 5 minutes. I can lower that to something much shorter, such as 1 second, but even then I still see a small number of 401s.

I'm looking for a way to refresh the JWK set when JWT authentication fails in Spring Boot, and then validate the same request again against the refreshed JWK set to see whether it passes. This will make the request slower, but that is better than serving a 401.

I see no way to do this in the Spring documentation. I also don't know how to "reprocess" a request that failed with a 401 in Spring Security.

Hence my question: how can I do this?

Solution

Ramping up the number of fetches won't solve your problem; 401s will still happen. In my opinion, you are approaching the problem from the wrong angle.

Instead, you need the JWKs to overlap properly. Here is what I mean by that.

Example:

Token_A is signed by JWK_A and is valid for 10 minutes.
Token_B is signed by JWK_B and is valid for 10 minutes.

What Nimbus fetches is a list of JWKs from the endpoint.

Let's build a timeline:

Token_A is issued at 12:00 and expires at 12:10.
Nimbus fetches the JWK list, finds only JWK_A in it, and can validate the signature.
At 12:01 JWK_B is added to the list of JWKs.
HERE you should wait n minutes before you start issuing Token_B tokens, so that we know all caches have been updated.
Nimbus updates its cache by fetching the JWKs and now has 2 JWKs in the list, JWK_A and JWK_B.
Token_B gets issued at 12:07 (after the 5-minute cache update).
At 12:11 we can remove JWK_A from the list, because the last Token_A that was issued has expired. In general, keep the old key until the expiration time of the last token signed with it.
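
The two waiting periods in this timeline can be derived from just two settings. The following is a minimal sketch using only the JDK; the class and method names (RotationTimeline, earliestFirstUse, earliestRemoval) are hypothetical, and the 5-minute cache interval and 10-minute token lifetime are simply the numbers from the example above.

```java
import java.time.Duration;
import java.time.LocalTime;

/** Hypothetical helper that derives the safe points of a key rotation. */
public class RotationTimeline {

    /**
     * Earliest time the issuer may start signing with the new key:
     * every verifier cache must have refreshed once since the key was published.
     */
    public static LocalTime earliestFirstUse(LocalTime newKeyPublished, Duration cacheRefresh) {
        return newKeyPublished.plus(cacheRefresh);
    }

    /**
     * Earliest time the old key may be removed:
     * the last token signed with it must have expired.
     */
    public static LocalTime earliestRemoval(LocalTime lastOldTokenIssued, Duration tokenLifetime) {
        return lastOldTokenIssued.plus(tokenLifetime);
    }

    public static void main(String[] args) {
        Duration cacheRefresh = Duration.ofMinutes(5);   // assumed verifier cache interval
        Duration tokenLifetime = Duration.ofMinutes(10); // token validity from the example

        LocalTime jwkBPublished = LocalTime.of(12, 1);    // JWK_B added to the JWK list
        LocalTime lastTokenAIssued = LocalTime.of(12, 0); // last token signed with JWK_A

        System.out.println("Issue Token_B no earlier than "
                + earliestFirstUse(jwkBPublished, cacheRefresh));
        System.out.println("Remove JWK_A no earlier than "
                + earliestRemoval(lastTokenAIssued, tokenLifetime));
    }
}
```

With these inputs the sketch yields 12:06 and 12:10, so issuing Token_B at 12:07 and removing JWK_A at 12:11 both leave a one-minute margin. If the issuer keeps signing with JWK_A until the switch, pass that later time as lastOldTokenIssued and the removal moves back accordingly.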

This is, in my opinion, a much better and more effective way of making sure the rotation runs smoothly, and it does not create any extra overhead. All we need to do is add a new key and then wait n minutes for the caches to update. This process can be automated, so key rotation can happen automatically, say, every hour, or at whatever interval you wish.

For instance, always have 2 JWKs published, and over 30 minutes replace one JWK with the other. This way we always know that every service has either one, the other, or both at any given time.
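
To see why the waiting period matters, here is a toy simulation of the timeline, written as a sketch with only the JDK. It is a deliberately simplified model, not how Nimbus is implemented: the verifier is assumed to refresh strictly on a 5-minute timer (at 12:00, 12:05, ...), minutes are counted from 12:00, and all names (OverlapSimulation, publishedKeys, cachedKeys, canVerify) are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

/** Toy model: one issuer JWK list, one verifier with a timer-based cache. */
public class OverlapSimulation {
    static final int REFRESH_MINUTES = 5; // assumed verifier cache interval
    static final int JWK_B_PUBLISHED = 1; // minute after 12:00 at which JWK_B is added

    /** Key ids the issuer publishes at the given minute. */
    public static Set<String> publishedKeys(int minute) {
        Set<String> keys = new TreeSet<>();
        keys.add("JWK_A");
        if (minute >= JWK_B_PUBLISHED) {
            keys.add("JWK_B");
        }
        return keys;
    }

    /** Key ids the verifier holds: a snapshot taken at its most recent refresh. */
    public static Set<String> cachedKeys(int minute) {
        int lastRefresh = (minute / REFRESH_MINUTES) * REFRESH_MINUTES;
        return publishedKeys(lastRefresh);
    }

    /** A token can be verified only if its signing key is in the verifier's cache. */
    public static boolean canVerify(String keyId, int minute) {
        return cachedKeys(minute).contains(keyId);
    }

    public static void main(String[] args) {
        // Token_B used at 12:02, before the cache has refreshed: would be a 401.
        System.out.println("12:02 Token_B verifiable: " + canVerify("JWK_B", 2));
        // Token_B used at 12:07, after the 12:05 refresh: accepted.
        System.out.println("12:07 Token_B verifiable: " + canVerify("JWK_B", 7));
        // Token_A stays verifiable because JWK_A is still published.
        System.out.println("12:07 Token_A verifiable: " + canVerify("JWK_A", 7));
    }
}
```

In this model, a Token_B presented at 12:02 is rejected because the verifier still holds the 12:00 snapshot, while at 12:07 both tokens verify. Delaying the first Token_B until every cache has refreshed is what removes the 401s; a shorter refresh interval only narrows the window.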