kubernetes watch rbac argo-workflows

I have an RBAC problem, but everything I test seems ok?


This is a continuation of the problem described in "How do I fix a role-based problem when my role appears to have the correct permissions?"

I have done much more testing and still do not understand the error:

Error from server (Forbidden): pods is forbidden: User "dma" cannot list resource "pods" in API group "" at the cluster scope

UPDATE: Here is another hint from the API server

watch chan error: etcdserver: mvcc: required revision has been compacted

I found the thread "How fix this error 'watch chan error: etcdserver: mvcc: required revision has been compacted'?", but I am running a current version of Kubernetes.

My user exists

NAME   AGE   SIGNERNAME                            REQUESTOR          REQUESTEDDURATION   CONDITION
dma    77m   kubernetes.io/kube-apiserver-client   kubernetes-admin   <none>              Approved,Issued

The clusterrole exists

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"rbac.authorization.k8s.io/v1","kind":"ClusterRole","metadata":{"annotations":{},"name":"kubelet-runtime"},"rules":[{"apiGroups":["","extensions","apps","argoproj.io","workflows.argoproj.io","events.argoproj.io","coordination.k8s.io"],"resources":["*"],"verbs":["*"]},{"apiGroups":["batch"],"resources":["jobs","cronjobs"],"verbs":["*"]}]}
  creationTimestamp: "2021-12-16T00:24:56Z"
  name: kubelet-runtime
  resourceVersion: "296716"
  uid: a4697d6e-c786-4ec9-bf3e-88e3dbfdb6d9
rules:
- apiGroups:
  - ""
  - extensions
  - apps
  - argoproj.io
  - workflows.argoproj.io
  - events.argoproj.io
  - coordination.k8s.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - batch
  resources:
  - jobs
  - cronjobs
  verbs:
  - '*'

The sandbox namespace exists

NAME      STATUS   AGE
sandbox   Active   6d6h

My user has authority to operate in the kubelet cluster and the namespace "sandbox"

{
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "ClusterRoleBinding",
    "metadata": {
        "annotations": {
            "kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"rbac.authorization.k8s.io/v1\",\"kind\":\"ClusterRoleBinding\",\"metadata\":{\"annotations\":{},\"name\":\"dma-kubelet-binding\"},\"roleRef\":{\"apiGroup\":\"rbac.authorization.k8s.io\",\"kind\":\"ClusterRole\",\"name\":\"kubelet-runtime\"},\"subjects\":[{\"kind\":\"ServiceAccount\",\"name\":\"dma\",\"namespace\":\"argo\"},{\"kind\":\"ServiceAccount\",\"name\":\"dma\",\"namespace\":\"argo-events\"},{\"kind\":\"ServiceAccount\",\"name\":\"dma\",\"namespace\":\"sandbox\"}]}\n"
        },
        "creationTimestamp": "2021-12-16T00:25:42Z",
        "name": "dma-kubelet-binding",
        "resourceVersion": "371397",
        "uid": "a2fb6d5b-8dba-4320-af74-71caac7bdc39"
    },
    "roleRef": {
        "apiGroup": "rbac.authorization.k8s.io",
        "kind": "ClusterRole",
        "name": "kubelet-runtime"
    },
    "subjects": [
        {
            "kind": "ServiceAccount",
            "name": "dma",
            "namespace": "argo"
        },
        {
            "kind": "ServiceAccount",
            "name": "dma",
            "namespace": "argo-events"
        },
        {
            "kind": "ServiceAccount",
            "name": "dma",
            "namespace": "sandbox"
        }
    ]
}
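
One detail worth checking in the binding above: every subject is of kind ServiceAccount, while the Forbidden error names User "dma", an identity that comes from the client certificate. The sketch below is a simplified Python model of how RBAC compares a binding subject with the authenticated username. The function name `subject_matches` and the dictionary layout are illustrative assumptions, not a Kubernetes API.

```python
# Simplified model of RBAC subject matching (illustrative only).
SA_PREFIX = "system:serviceaccount:"

def subject_matches(subject, username, groups=()):
    """Return True if a binding subject applies to the authenticated user."""
    kind = subject["kind"]
    if kind == "User":
        return subject["name"] == username
    if kind == "Group":
        return subject["name"] in groups
    if kind == "ServiceAccount":
        # Service accounts authenticate as system:serviceaccount:<namespace>:<name>
        return username == f"{SA_PREFIX}{subject['namespace']}:{subject['name']}"
    return False

# Subjects copied from the dma-kubelet-binding shown above.
binding_subjects = [
    {"kind": "ServiceAccount", "name": "dma", "namespace": "argo"},
    {"kind": "ServiceAccount", "name": "dma", "namespace": "argo-events"},
    {"kind": "ServiceAccount", "name": "dma", "namespace": "sandbox"},
]

cert_user = "dma"  # identity named in the Forbidden error
cert_match = any(subject_matches(s, cert_user) for s in binding_subjects)
sa_match = any(
    subject_matches(s, "system:serviceaccount:sandbox:dma") for s in binding_subjects
)
user_subject_match = subject_matches({"kind": "User", "name": "dma"}, cert_user)
print(cert_match, sa_match, user_subject_match)
```

Under this model the certificate user "dma" matches none of the three ServiceAccount subjects, whereas a subject of kind User with the name dma would match.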

My user has the correct permissions

{
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "Role",
    "metadata": {
        "annotations": {
            "kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"rbac.authorization.k8s.io/v1\",\"kind\":\"Role\",\"metadata\":{\"annotations\":{},\"name\":\"dma\",\"namespace\":\"sandbox\"},\"rules\":[{\"apiGroups\":[\"\",\"apps\",\"autoscaling\",\"batch\",\"extensions\",\"policy\",\"rbac.authorization.k8s.io\",\"argoproj.io\",\"workflows.argoproj.io\"],\"resources\":[\"pods\",\"configmaps\",\"deployments\",\"events\",\"pods\",\"persistentvolumes\",\"persistentvolumeclaims\",\"services\",\"workflows\"],\"verbs\":[\"get\",\"list\",\"watch\",\"create\",\"update\",\"patch\",\"delete\"]}]}\n"
        },
        "creationTimestamp": "2021-12-21T19:41:38Z",
        "name": "dma",
        "namespace": "sandbox",
        "resourceVersion": "1058387",
        "uid": "94191881-895d-4457-9764-5db9b54cdb3f"
    },
    "rules": [
        {
            "apiGroups": [
                "",
                "apps",
                "autoscaling",
                "batch",
                "extensions",
                "policy",
                "rbac.authorization.k8s.io",
                "argoproj.io",
                "workflows.argoproj.io"
            ],
            "resources": [
                "pods",
                "configmaps",
                "deployments",
                "events",
                "pods",
                "persistentvolumes",
                "persistentvolumeclaims",
                "services",
                "workflows"
            ],
            "verbs": [
                "get",
                "list",
                "watch",
                "create",
                "update",
                "patch",
                "delete"
            ]
        }
    ]
}
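
The phrase "at the cluster scope" in the error suggests the request carried no namespace (for example, a list across all namespaces). As a rough sketch, assuming the usual RBAC behaviour that a RoleBinding only covers requests inside its own namespace while a ClusterRoleBinding covers every request, the Role above could not satisfy such a request on its own. The helper `bindings_in_scope`, its data layout, and the RoleBinding name `dma-sandbox` are hypothetical.

```python
# Simplified model of which bindings RBAC consults for a request (illustrative only).
def bindings_in_scope(bindings, request_namespace):
    """Return the names of the bindings that can apply to a request.

    request_namespace is "" for a cluster-scope request.
    """
    selected = []
    for b in bindings:
        if b["kind"] == "ClusterRoleBinding":
            # Consulted for every request, namespaced or not.
            selected.append(b["name"])
        elif (
            b["kind"] == "RoleBinding"
            and request_namespace
            and b["namespace"] == request_namespace
        ):
            # Consulted only for requests inside its own namespace.
            selected.append(b["name"])
    return selected

bindings = [
    {"kind": "ClusterRoleBinding", "name": "dma-kubelet-binding"},
    # Hypothetical RoleBinding for the sandbox Role; the question does not show one.
    {"kind": "RoleBinding", "name": "dma-sandbox", "namespace": "sandbox"},
]

cluster_scope = bindings_in_scope(bindings, "")
sandbox_scope = bindings_in_scope(bindings, "sandbox")
print(cluster_scope)
print(sandbox_scope)
```

In this model only the ClusterRoleBinding is in scope for a cluster-wide list, so a namespaced Role can help only when the request names the sandbox namespace.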

My user is configured correctly on all nodes

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://206.81.25.186:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: dma
  name: dma@kubernetes
- context:
    cluster: kubernetes
    user: kubernetes-admin
  name: kubernetes-admin@kubernetes
current-context: kubernetes-admin@kubernetes
kind: Config
preferences: {}
users:
- name: dma
  user:
    client-certificate-data: REDACTED
    client-key-data: REDACTED
- name: kubernetes-admin
  user:
    client-certificate-data: REDACTED
    client-key-data: REDACTED

Based on this website, I have been searching for a watch event.

I think I have rebuilt everything above the control plane, but the problem persists.

The next step would be to rebuild the entire cluster, but it would be so much more satisfying to find the actual problem.

Please help.

FIX: So the policy for the sandbox namespace was wrong. I fixed that and the problem is gone!


Solution

  • I think I finally understand RBAC (policies and all). Many thanks to the members of the Kubernetes Slack channel. These policies have passed the first set of tests for a development environment ("sandbox") for Argo Workflows. I am still testing.

    policies.yaml file:

    ---
    kind: Role
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: dev
      namespace: sandbox
    rules:
      - apiGroups:
          - "*"
        attributeRestrictions: null
        resources: ["*"]    
        verbs:
          - get
          - watch
          - list
      - apiGroups: ["argoproj.io", "workflows.argoproj.io", "events.argoproj.io"]
        attributeRestrictions: null
        resources:
          - pods
          - configmaps
          - deployments
          - events
          - pods
          - persistentvolumes
          - persistentvolumeclaims
          - services
          - workflows
          - eventbus
          - eventsource
          - sensor
        verbs: ["*"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: dma-dev
      # Same namespace as the dev Role, so that the roleRef resolves.
      namespace: sandbox
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: dev
    subjects:
    - kind: User
      name: dma
    ---
    kind: ClusterRoleBinding
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: dma-admin
    subjects:
    - kind: User
      name: dma
      namespace: sandbox
    roleRef:
      kind: ClusterRole
      name: cluster-admin
      apiGroup: rbac.authorization.k8s.io
    ---
    kind: NetworkPolicy
    apiVersion: networking.k8s.io/v1
    metadata:
      name: access-nginx
      namespace: sandbox
    spec:
      podSelector:
        matchLabels:
          app: nginx
      ingress:
        - from:
          - podSelector:
              matchLabels:
                run: access
    ...
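
A rule grants a request only when the API group, the resource, and the verb all match. The sketch below is a simplified Python model of that check (it ignores subresources and resourceNames, and the rule lists are abbreviated); `rule_allows` and `role_allows` are illustrative names. Under that model, listing `pods` in a rule whose apiGroups are all Argo groups does not grant write access to core pods, because pods belong to the core group `""`. In the policies above, that access would come from the cluster-admin ClusterRoleBinding instead.

```python
def matches(allowed, value):
    """A rule field matches on an exact entry or on the "*" wildcard."""
    return "*" in allowed or value in allowed

def rule_allows(rule, api_group, resource, verb):
    """All three fields must match for the rule to grant the request."""
    return (
        matches(rule["apiGroups"], api_group)
        and matches(rule["resources"], resource)
        and matches(rule["verbs"], verb)
    )

def role_allows(rules, api_group, resource, verb):
    """A Role grants a request if any one of its rules does."""
    return any(rule_allows(r, api_group, resource, verb) for r in rules)

# The two rules of the "dev" Role above (lists abbreviated).
dev_rules = [
    {"apiGroups": ["*"], "resources": ["*"], "verbs": ["get", "watch", "list"]},
    {
        "apiGroups": ["argoproj.io", "workflows.argoproj.io"],
        "resources": ["pods", "configmaps", "workflows", "eventbus", "sensor"],
        "verbs": ["*"],
    },
]

can_list_pods = role_allows(dev_rules, "", "pods", "list")
can_create_pods = role_allows(dev_rules, "", "pods", "create")
can_create_workflows = role_allows(dev_rules, "argoproj.io", "workflows", "create")
print(can_list_pods, can_create_pods, can_create_workflows)
```

With these rules the model allows listing core pods and creating workflows, but not creating core pods through the dev Role alone.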