Search code examples
oauth-2.0authorizationopenid-connectistiorbac

Istio AuthorizationPolicy returning 403 after login flow using Oauth2-Proxy and Dex


I am trying to use OAuth2-Proxy with an Istio AuthorizationPolicy to handle login and authorization for an application running on AKS. As it stands, when I hit my application endpoint in a browser (httpbin.example.com), I'm successfully redirected to Dex, and I'm able to login using Dex (using local db username/password) and then get redirected back to my app. But at this point I get a 403 error reading "RBAC: Access Denied". That message, I believe, is coming from the Istio AuthorizationPolicy.

Istio version: 1.18.2

As it stands, my AuthorizationPolicy looks like this:

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: oauth-policy
  namespace: istio-system
spec:
  selector:
    matchLabels:
      istio: ingressgateway
  action: CUSTOM
  provider:
    name: "oauth2-proxy"
  rules:
  # The rules specify when to trigger the external authorizer.
  - to:
    - operation:
        hosts:
        - "httpbin.example.com"

and my configuration for the CUSTOM provider looks like this:

meshConfig:
    extensionProviders:
    - name: "oauth2-proxy"
      envoyExtAuthzHttp:
        service: "oauth2-proxy.demo.svc.cluster.local"
        port: "80"
        headersToDownstreamOnDeny:
          - content-type
          - set-cookie
        headersToUpstreamOnAllow:
          - authorization
          - cookie
          - path
          - x-auth-request-access-token
          - x-forwarded-access-token
        includeHeadersInCheck:
          - "cookie"
          - "x-forwarded-access-token"
          - "x-forwarded-user"
          - "x-forwarded-email"
          - "authorization"
          - "x-forwarded-proto"
          - "proxy-authorization"
          - "user-agent"
          - "x-forwarded-host"
          - "from"
          - "x-forwarded-for"
          - "accept"
        includeAdditionalHeadersInCheck:
          authorization: '%REQ(x-auth-request-access-token)%'

And my config for OAuth2-Proxy looks like this:

# Oauth client configuration specifics
config:
  # OAuth client ID

  clientID: "oauth2-proxy"
  # OAuth client secret
  clientSecret: "ZXhhbXBsZS1hcHAtc2VjcmV0"

  cookieSecure: true
  cookieSecret: "b311562c684c75e497b4fb3f08c3deea"
  # The name of the cookie that oauth2-proxy will create
  # If left empty, it will default to the release name
  cookieName: "_oauth2_proxy"

  configFile: |-
    email_domains = [ "*" ]
    upstreams = [ "static://200" ]
    provider = "oidc"
    cookie_refresh = "5m"
    cookie_expire = "4h"
    cookie_domains = [ "httpbin.example.com" ]
    cookie_samesite = "lax"
    set_xauthrequest = true
    set_authorization_header = true
    pass_authorization_header = true
    pass_host_header = true
    pass_access_token = true
    skip_jwt_bearer_tokens = true
    reverse_proxy = true
    skip_provider_button = true
    http_address = "0.0.0.0:4180"
    silence_ping_logging = true
    oidc_issuer_url = "http://dex.example.com"
    scope = "openid"
    oidc_email_claim = "sub"
    real_client_ip_header = "X-Forwarded-For"

I see this in the Dex logs after an attempt, which suggests that everything is good up the point that Dex returns at least:

time="2023-11-06T18:10:02Z" level=info msg="login successful: connector \"local\", username=\"admin\", preferred_username=\"\", email=\"[email protected]\", groups=[]"

And in turn I get this in the OAuth2-Proxy logs:

[2023/11/06 18:09:51] [oauthproxy.go:970] No valid authentication in request. Initiating login.
10.215.1.230 - 7e0f3851-d605-4e71-b9e1-335dabad9397 - - [2023/11/06 18:09:51] httpbin.example.com GET - "/" HTTP/1.1 "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36" 302 270 0.000
10.215.1.230 - 18df84c5-2636-4f0d-9278-51035f8db86c - CiQwOGE4Njg0Yi1kYjg4LTRiNzMtOTBhOS0zY2QxNjYxZjU0NjYSBWxvY2Fs [2023/11/06 18:10:02] [AuthSuccess] Authenticated via OAuth2: Session{email:CiQwOGE4Njg0Yi1kYjg4LTRiNzMtOTBhOS0zY2QxNjYxZjU0NjYSBWxvY2Fs user:CiQwOGE4Njg0Yi1kYjg4LTRiNzMtOTBhOS0zY2QxNjYxZjU0NjYSBWxvY2Fs PreferredUsername: token:true id_token:true created:2023-11-06 18:10:02.356686277 +0000 UTC m=+2538.147613649 expires:2023-11-07 18:10:01.353689544 +0000 UTC m=+88937.144616916}
10.215.1.230 - 18df84c5-2636-4f0d-9278-51035f8db86c - - [2023/11/06 18:10:02] httpbin.example.com GET - "/oauth2/callback?code=nomx6zwp4e2owq6ys6i26yji5&state=EhrfRhvb9g2zQfB1lGanyMaGJejNmegAo9Y_ejeAenA%3A%2F" HTTP/1.1 "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36" 302 24 0.040
10.215.1.230 - 64574381-e3e1-486c-864c-f6a708b94fc6 - CiQwOGE4Njg0Yi1kYjg4LTRiNzMtOTBhOS0zY2QxNjYxZjU0NjYSBWxvY2Fs [2023/11/06 18:10:02] httpbin.example.com GET static://200 "/" HTTP/1.1 "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36" 200 13 0.000
10.215.1.230 - ecfb0cb8-bf41-4362-b914-46ecf5820de7 - CiQwOGE4Njg0Yi1kYjg4LTRiNzMtOTBhOS0zY2QxNjYxZjU0NjYSBWxvY2Fs [2023/11/06 18:10:02] httpbin.example.com GET static://200 "/favicon.ico" HTTP/1.1 "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36" 200 13 0.000

The bit reading [AuthSuccess] Authenticated via OAuth2: would lead me to believe that things are good with OAuth2-Proxy, but maybe there's something I'm missing here.

The envoy-proxy logs for the httpbin pod don't seem to contain anything relevant at first blush:

2023-11-06T17:27:32.275893Z info    JWT policy is third-party-jwt
2023-11-06T17:27:32.275902Z info    using credential fetcher of JWT type in cluster.local trust domain
2023-11-06T17:27:32.276032Z info    platform detected is Azure
2023-11-06T17:27:32.285100Z warn    HTTP request unsuccessful with status: 400 Bad Request
2023-11-06T17:27:32.300832Z info    Workload SDS socket not found. Starting Istio SDS Server
2023-11-06T17:27:32.300859Z info    CA Endpoint istiod.istio-system.svc:15012, provider Citadel
2023-11-06T17:27:32.300877Z info    Using CA istiod.istio-system.svc:15012 cert with certs: var/run/secrets/istio/root-cert.pem
2023-11-06T17:27:32.301010Z info    Opening status port 15020
2023-11-06T17:27:32.317004Z info    ads All caches have been synced up in 46.961061ms, marking server ready
2023-11-06T17:27:32.317243Z info    xdsproxy    Initializing with upstream address "istiod.istio-system.svc:15012" and cluster "Kubernetes"
2023-11-06T17:27:32.317285Z info    sds Starting SDS grpc server
2023-11-06T17:27:32.317672Z info    starting Http service at 127.0.0.1:15004
2023-11-06T17:27:32.318984Z info    Pilot SAN: [istiod.istio-system.svc]
2023-11-06T17:27:32.319990Z info    Starting proxy agent
2023-11-06T17:27:32.320018Z info    starting
2023-11-06T17:27:32.320043Z info    Envoy command: [-c etc/istio/proxy/envoy-rev.json --drain-time-s 45 --drain-strategy immediate --local-address-ip-version v4 --file-flush-interval-msec 1000 --disable-hot-restart --allow-unknown-static-fields --log-format %Y-%m-%dT%T.%fZ   %l  envoy %n %g:%#  %v  thread=%t -l warning --component-log-level misc:error --concurrency 16]
2023-11-06T17:27:32.427876Z info    xdsproxy    connected to upstream XDS server: istiod.istio-system.svc:15012
2023-11-06T17:27:32.457010Z info    ads ADS: new connection for node:httpbin-7987fb4cc7-7vr9x.demo-1
2023-11-06T17:27:32.457262Z info    ads ADS: new connection for node:httpbin-7987fb4cc7-7vr9x.demo-2
2023-11-06T17:27:32.523788Z info    cache   generated new workload certificate  latency=206.479426ms ttl=23h59m59.476221912s
2023-11-06T17:27:32.523823Z info    cache   Root cert has changed, start rotating root cert
2023-11-06T17:27:32.523840Z info    ads XDS: Incremental Pushing:0 ConnectedEndpoints:2 Version:
2023-11-06T17:27:32.523888Z info    cache   returned workload trust anchor from cache   ttl=23h59m59.476113611s
2023-11-06T17:27:32.523920Z info    cache   returned workload certificate from cache    ttl=23h59m59.47608181s
2023-11-06T17:27:32.524062Z info    cache   returned workload trust anchor from cache   ttl=23h59m59.475948309s
2023-11-06T17:27:32.524200Z info    ads SDS: PUSH request for node:httpbin-7987fb4cc7-7vr9x.demo resources:1 size:4.0kB resource:default
2023-11-06T17:27:32.524272Z info    ads SDS: PUSH request for node:httpbin-7987fb4cc7-7vr9x.demo resources:1 size:1.1kB resource:ROOTCA
2023-11-06T17:27:32.524358Z info    cache   returned workload trust anchor from cache   ttl=23h59m59.475645506s
2023-11-06T17:27:33.858345Z info    Readiness succeeded in 1.594680949s
2023-11-06T17:27:33.858740Z info    Envoy proxy is ready
2023-11-06T17:55:26.183500Z info    xdsproxy    connected to upstream XDS server: istiod.istio-system.svc:15012

Likewise I don't see anything in the proxy logs for the Ingress Gateway that looks weird. No 403 errors or "rbac_acces_denied" stuff or anything. To a casual look, it appears that everything succeeded. But what my browser sees is that 403 "RBAC Access Denied" message.

2023-11-07T18:01:45.015545Z info    Readiness succeeded in 505.545447ms
2023-11-07T18:01:45.015936Z info    Envoy proxy is ready
[2023-11-07T18:01:45.257Z] "GET /.well-known/openid-configuration HTTP/1.1" 200 - via_upstream - "-" 0 1135 14 14 "-" "Go-http-client/1.1" "d8108d75-c05a-40a7-aa0e-6f2fffb483c8" "dex.example.com" "10.215.1.235:80" PassthroughCluster 172.17.2.72:37518 10.215.1.235:80 172.17.2.72:55152 - allow_any
[2023-11-07T18:02:21.416Z] "GET / HTTP/1.1" 302 - via_upstream - "-" 0 270 1 0 "10.215.1.230,172.17.3.24" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36" "c7dbc03c-cc6c-47f1-bed2-3bac1d40b544" "httpbin.example.com" "172.17.2.72:4180" inbound|4180|| 127.0.0.6:46533 172.17.2.72:4180 172.17.3.24:0 outbound_.80_._.oauth2-proxy.demo.svc.cluster.local default
[2023-11-07T18:02:49.886Z] "POST /token HTTP/1.1" 200 - via_upstream - "-" 127 1489 33 33 "-" "Go-http-client/1.1" "cd49734e-0ff8-4f0d-a884-dc86ab62ec11" "dex.example.com" "10.215.1.235:80" PassthroughCluster 172.17.2.72:37518 10.215.1.235:80 172.17.2.72:55152 - allow_any
[2023-11-07T18:02:49.921Z] "GET /keys HTTP/1.1" 200 - via_upstream - "-" 0 525 3 3 "-" "Go-http-client/1.1" "98c0b3db-c7a0-418b-a1d6-3d55ce5cedb9" "dex.example.com" "10.215.1.235:80" PassthroughCluster 172.17.2.72:37518 10.215.1.235:80 172.17.2.72:55152 - allow_any
[2023-11-07T18:02:49.925Z] "GET /userinfo HTTP/1.1" 200 - via_upstream - "-" 0 191 2 2 "-" "Go-http-client/1.1" "32729787-c856-4ff0-9519-32588a973288" "dex.example.com" "10.215.1.235:80" PassthroughCluster 172.17.2.72:37518 10.215.1.235:80 172.17.2.72:55152 - allow_any
[2023-11-07T18:02:49.885Z] "GET /oauth2/callback?code=lt6gnuhpx4enqsla2yvbwflv4&state=1XRICtrxCimKRC7ukPCSNLB6abHVAE1s0JPtmb2b4do%3A%2F HTTP/1.1" 302 - via_upstream - "-" 0 24 42 42 "10.215.1.230,172.17.3.24" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36" "b31ee817-8c36-4d48-bd88-6e46ce232407" "httpbin.example.com" "172.17.2.72:4180" inbound|4180|| 127.0.0.6:46533 172.17.2.72:4180 172.17.3.24:0 outbound_.80_._.oauth2-proxy.demo.svc.cluster.local default
[2023-11-07T18:02:49.949Z] "GET / HTTP/1.1" 200 - via_upstream - "-" 0 13 0 0 "10.215.1.230,172.17.3.24" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36" "5f4ed413-772c-49cd-93e2-0fc6f98b96a3" "httpbin.example.com" "172.17.2.72:4180" inbound|4180|| 127.0.0.6:46533 172.17.2.72:4180 172.17.3.24:0 outbound_.80_._.oauth2-proxy.demo.svc.cluster.local default

I'm at something of a dead-end here. Anybody have any thoughts or ideas for further debugging / experimentation?

EDIT:

One other thing I did, was to deploy the default sample ext auth provider and configure it, and drop an AuthorizationPolicy in place that uses that to gate access to the httpbin pod. In that case, everything works fine. If I leave off the magic header that allows access, I get back a 403, but if I include that header I can access the pod. So it appears that AuthorizationPolicies in the general sense are working, and that somehow what I'm getting back from OAuth2-Proxy at the end of the flow is not what the AuthorizationPolicy is expecting for a "success" case.


Solution

  • SOLVED. In this specific case, the problem appears to have been related to having two AuthorizationPolicy objects bound to the same namespace, and that was colliding in a way that did not work. It turns out that the second AuthorizationPolicy was something I'd thrown in while working through some random tutorial, as an experiment, and really wasn't needed at all. Once I removed it from my deployment, everything worked as expected.