kubernetes, kubectl, rancher, k3s

K3S cluster is pending in Rancher dashboard


I have installed a 3-node cluster with K3S. The nodes are correctly detected by kubectl and I'm able to deploy images.

$ k3s kubectl get nodes
NAME                   STATUS   ROLES                       AGE     VERSION
master                 Ready    control-plane,etcd,master   4h31m   v1.22.2+k3s1
worker-01              Ready    <none>                      3h59m   v1.22.2+k3s1
worker-02              Ready    <none>                      4h3m    v1.22.2+k3s1
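
For reference, a 3-node layout like this is typically built with the standard get.k3s.io installer; the commands below are only a sketch (the master address and node token are placeholders, and --cluster-init matches the control-plane,etcd,master roles shown above), not necessarily how this cluster was created:

# On the master: K3s server with embedded etcd, pinned to the version shown above
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="v1.22.2+k3s1" sh -s - server --cluster-init

# Token the workers need in order to join
sudo cat /var/lib/rancher/k3s/server/node-token

# On each worker, pointing at the master
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="v1.22.2+k3s1" \
  K3S_URL="https://<master-ip>:6443" K3S_TOKEN="<node-token>" sh -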

I've also installed the latest Rancher version (2.6.0) via docker-compose:

version: '2'
services:
  rancher:
    image: rancher/rancher:latest
    restart: always
    ports:
    - "8080:80/tcp"
    - "4443:443/tcp"
    volumes:
    - "rancher-data:/var/lib/rancher"
    privileged: true
volumes:
  rancher-data:
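
With that file in place, the usual docker-compose workflow applies (a sketch; the service name rancher comes from the file above):

$ docker-compose up -d rancher
$ docker-compose logs -f rancher   # watch the bootstrap; the dashboard then answers on https://<host>:4443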

The dashboard is reachable from every node, and I've imported the existing K3S cluster by running the following command:

curl --insecure -sfL https://192.168.1.100:4443/v3/import/66txfzmv4fnw6bqj99lpmdt6jlx4rpwblzhx96wvljc8gczphcn2c2_c-m-nz826pgl.yaml | kubectl apply -f -
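
Applying that manifest creates the cattle-system namespace and the cattle-cluster-agent deployment on the imported cluster; a quick way to confirm it rolled out (a sketch):

$ kubectl -n cattle-system rollout status deploy/cattle-cluster-agent
$ kubectl -n cattle-system get pods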

The cluster appears as Active but with 0 nodes and with the message:

[Pending] waiting for full cluster configuration

The full YAML status of the cluster object is:

apiVersion: provisioning.cattle.io/v1
kind: Cluster
metadata:
  annotations:
    field.cattle.io/creatorId: user-5bk6w
  creationTimestamp: "2021-10-05T10:06:35Z"
  finalizers:
  - wrangler.cattle.io/provisioning-cluster-remove
  generation: 1
  managedFields:
  - apiVersion: provisioning.cattle.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .: {}
          v:"wrangler.cattle.io/provisioning-cluster-remove": {}
      f:spec: {}
      f:status:
        .: {}
        f:clientSecretName: {}
        f:clusterName: {}
        f:conditions: {}
        f:observedGeneration: {}
        f:ready: {}
    manager: rancher
    operation: Update
    time: "2021-10-05T10:08:30Z"
  name: ofb
  namespace: fleet-default
  resourceVersion: "73357"
  uid: 1d03f05e-77b7-4361-947d-2ef5b50928f5
spec: {}
status:
  clientSecretName: ofb-kubeconfig
  clusterName: c-m-nz826pgl
  conditions:
  - lastUpdateTime: "2021-10-05T10:08:30Z"
    status: "False"
    type: Reconciling
  - lastUpdateTime: "2021-10-05T10:06:35Z"
    status: "False"
    type: Stalled
  - lastUpdateTime: "2021-10-05T14:08:52Z"
    status: "True"
    type: Created
  - lastUpdateTime: "2021-10-05T10:06:35Z"
    status: "True"
    type: RKECluster
  - lastUpdateTime: "2021-10-05T10:06:35Z"
    status: "True"
    type: BackingNamespaceCreated
  - lastUpdateTime: "2021-10-05T10:06:35Z"
    status: "True"
    type: DefaultProjectCreated
  - lastUpdateTime: "2021-10-05T10:06:35Z"
    status: "True"
    type: SystemProjectCreated
  - lastUpdateTime: "2021-10-05T10:06:35Z"
    status: "True"
    type: InitialRolesPopulated
  - lastUpdateTime: "2021-10-05T10:06:35Z"
    status: "True"
    type: CreatorMadeOwner
  - lastUpdateTime: "2021-10-05T10:08:15Z"
    status: "True"
    type: Pending
  - lastUpdateTime: "2021-10-05T10:08:15Z"
    message: waiting for full cluster configuration
    reason: Pending
    status: "True"
    type: Provisioned
  - lastUpdateTime: "2021-10-05T14:08:52Z"
    message: Waiting for API to be available
    status: "True"
    type: Waiting
  - lastUpdateTime: "2021-10-05T10:06:35Z"
    status: "True"
    type: NoDiskPressure
  - lastUpdateTime: "2021-10-05T10:06:35Z"
    status: "True"
    type: NoMemoryPressure
  - lastUpdateTime: "2021-10-05T10:06:39Z"
    status: "False"
    type: Connected
  - lastUpdateTime: "2021-10-05T14:04:52Z"
    status: "True"
    type: Ready
  observedGeneration: 1
  ready: true
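
For what it's worth, the object above lives on the Rancher local cluster; assuming kubectl access to it (for a Docker install, e.g. via docker exec into the Rancher container), the conditions can be pulled out directly, a sketch:

$ kubectl -n fleet-default get clusters.provisioning.cattle.io ofb \
    -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.message}{"\n"}{end}'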

The cluster agent logs show no particular issues:

$ kubectl -n cattle-system logs -l app=cattle-cluster-agent
time="2021-10-05T13:54:30Z" level=info msg="Connecting to wss://192.168.1.100:4443/v3/connect with token starting with 66txfzmv4fnw6bqj99lpmdt6jlx"
time="2021-10-05T13:54:30Z" level=info msg="Connecting to proxy" url="wss://192.168.1.100:4443/v3/connect"

Is there something I need to do to get the cluster fully running? I've tried downgrading Rancher to version 2.5.0, but I got the same issue.


Solution

  • I believe this is an incompatibility with Kubernetes v1.22.

    Having encountered the same issue with Rancher v2.6.0 when importing a new v1.22.2 cluster (running on IBM Cloud VPC infrastructure), I tailed the logs of the Docker container running Rancher and observed:

    2021/10/20 14:40:31 [INFO] Starting cluster controllers for c-m-cs78tnxc
    E1020 14:40:31.346373      33 reflector.go:139] pkg/mod/github.com/rancher/[email protected]/tools/cache/reflector.go:168: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource (get ingresses.meta.k8s.io)
    

    Kubernetes v1.22 removes the v1beta1 Ingress API (which is also why NGINX-Ingress moved to v1.x), and that appears to be what the error above is hitting; there is an open issue on the Rancher GitHub to update Rancher for compatibility with Kubernetes v1.22.

    In the meantime, after recreating the cluster with Kubernetes v1.21.5 on the same infrastructure, I was able to import it into Rancher successfully. For a K3s cluster like the one in the question, the equivalent workaround is to pin the installed K3s version to a 1.21 release; see the sketch below.
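
    A minimal sketch of that pinning, assuming the standard get.k3s.io installer and that a v1.21.5 K3s build exists for your platform (check the K3s releases page for the exact +k3s suffix):

    # Reinstall the server pinned to a 1.21 release instead of the 1.22 default
    curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="v1.21.5+k3s1" sh -s - server --cluster-init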