
kops 'protectKernelDefaults' flag and 'EventRateLimit' admission plugin not working


I am trying to apply some of the CIS security benchmark recommendations to Kubernetes 1.21.4, deployed with kOps 1.21.0 as a self-hosted cluster on AWS.

However, when I set protectKernelDefaults: true in the kubelet config and enable the EventRateLimit admission plugin in the kube-apiserver config, the cluster fails to come up. I am bringing up a new cluster with these settings, not updating an existing one.

The kOps cluster YAML I am using is:

apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  name: k8s.sample.com
spec:
  cloudLabels:
    team_number: "0"
    environment: "dev"
  api:
    loadBalancer:
      type: Internal
      additionalSecurityGroups:
        - sg-id
    crossZoneLoadBalancing: false
    dns: { }
  authorization:
    rbac: { }
  channel: stable
  cloudProvider: aws
  configBase: s3://state-data/k8s.sample.com
  etcdClusters:
    - cpuRequest: 200m
      etcdMembers:
        - encryptedVolume: true
          instanceGroup: master-eu-west-3a
          name: a
      memoryRequest: 100Mi
      name: main
      env:
        - name: ETCD_MANAGER_HOURLY_BACKUPS_RETENTION
          value: 2d
        - name: ETCD_MANAGER_DAILY_BACKUPS_RETENTION
          value: 1m
        - name: ETCD_LISTEN_METRICS_URLS
          value: http://0.0.0.0:8081
        - name: ETCD_METRICS
          value: basic
    - cpuRequest: 100m
      etcdMembers:
        - encryptedVolume: true
          instanceGroup: master-eu-west-3a
          name: a
      memoryRequest: 100Mi
      name: events
      env:
        - name: ETCD_MANAGER_HOURLY_BACKUPS_RETENTION
          value: 2d
        - name: ETCD_MANAGER_DAILY_BACKUPS_RETENTION
          value: 1m
        - name: ETCD_LISTEN_METRICS_URLS
          value: http://0.0.0.0:8081
        - name: ETCD_METRICS
          value: basic
  iam:
    allowContainerRegistry: true
    legacy: false
  kubeControllerManager:
    enableProfiling: false
    logFormat: json
  kubeScheduler:
    logFormat: json
    enableProfiling: false
  kubelet:
    anonymousAuth: false
    logFormat: json
    protectKernelDefaults: true
    tlsCipherSuites: [ TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_GCM_SHA256 ]
  kubeAPIServer:
    auditLogMaxAge: 7
    auditLogMaxBackups: 1
    auditLogMaxSize: 25
    auditLogPath: /var/log/kube-apiserver-audit.log
    auditPolicyFile: /srv/kubernetes/audit/policy-config.yaml
    enableProfiling: false
    logFormat: json
    enableAdmissionPlugins:
      - NamespaceLifecycle
      - LimitRanger
      - ServiceAccount
      - PersistentVolumeLabel
      - DefaultStorageClass
      - DefaultTolerationSeconds
      - MutatingAdmissionWebhook
      - ValidatingAdmissionWebhook
      - NodeRestriction
      - ResourceQuota
      - AlwaysPullImages
      - EventRateLimit
      - SecurityContextDeny
  fileAssets:
    - name: audit-policy-config
      path: /srv/kubernetes/audit/policy-config.yaml
      roles:
        - Master
      content: |
        apiVersion: audit.k8s.io/v1
        kind: Policy
        rules:
        - level: Metadata
  kubernetesVersion: 1.21.4
  masterPublicName: api.k8s.sample.com
  networkID: vpc-id
  sshKeyName: node_key
  networking:
    calico:
      crossSubnet: true
  nonMasqueradeCIDR: 100.64.0.0/10
  subnets:
    - id: subnet-id1
      name: sn_nodes_1
      type: Private
      zone: eu-west-3a
    - id: subnet-id2
      name: sn_nodes_2
      type: Private
      zone: eu-west-3a
    - id: subnet-id3
      name: sn_utility_1
      type: Utility
      zone: eu-west-3a
    - id: subnet-id4
      name: sn_utility_2
      type: Utility
      zone: eu-west-3a
  topology:
    dns:
      type: Private
    masters: private
    nodes: private
  additionalPolicies:
    node: |
      [
        {
          "Effect": "Allow",
          "Action": [
            "kms:CreateGrant",
            "kms:Decrypt",
            "kms:DescribeKey",
            "kms:Encrypt",
            "kms:GenerateDataKey*",
            "kms:ReEncrypt*"
          ],
          "Resource": [
            "arn:aws:kms:region:xxxx:key/s3access"
          ]
        }
      ]
    master: |
      [
        {
          "Effect": "Allow",
          "Action": [
            "kms:CreateGrant",
            "kms:Decrypt",
            "kms:DescribeKey",
            "kms:Encrypt",
            "kms:GenerateDataKey*",
            "kms:ReEncrypt*"
          ],
          "Resource": [
            "arn:aws:kms:region:xxxx:key/s3access"
          ]
        }
      ]

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: k8s.sample.com
  name: master-eu-west-3a
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20210720
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-eu-west-3a
  role: Master
  subnets:
    - sn_nodes_1
    - sn_nodes_2
  detailedInstanceMonitoring: false
  additionalSecurityGroups:
    - sg-id

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: k8s.sample.com
  name: nodes-eu-west-3a
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20210720
  machineType: t3.large
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: nodes-eu-west-3a
  role: Node
  subnets:
    - sn_nodes_1
    - sn_nodes_2
  detailedInstanceMonitoring: false
  additionalSecurityGroups:
    - sg-id

Note: I have changed some of the values above to remove specific details.

I have also tried the protectKernelDefaults and EventRateLimit settings separately, and the cluster fails to come up in each case as well.

When I enable protectKernelDefaults, SSH to the master node, and check the /var/log directory, kube-scheduler.log, kube-proxy.log, kube-controller-manager.log, and kube-apiserver.log are all empty.

When I enable EventRateLimit, SSH to the master node, and check the /var/log directory, the API server fails to come up and all the other log files contain failures stating that they are unable to connect to the API server. kube-apiserver.log contains the following:

Log file created at: 2021/08/23 05:35:51
Running on machine: ip-10-100-120-9
Binary: Built with gc go1.16.7 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
Log file created at: 2021/08/23 05:35:54
Running on machine: ip-10-100-120-9
Binary: Built with gc go1.16.7 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
Log file created at: 2021/08/23 05:36:11
Running on machine: ip-10-100-120-9
Binary: Built with gc go1.16.7 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
Log file created at: 2021/08/23 05:36:32
Running on machine: ip-10-100-120-9
Binary: Built with gc go1.16.7 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
I0823 05:36:32.654990       1 flags.go:59] FLAG: --add-dir-header="false"
Log file created at: 2021/08/23 05:37:15
Running on machine: ip-10-100-120-9
Binary: Built with gc go1.16.7 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
Log file created at: 2021/08/23 05:38:44
Running on machine: ip-10-100-120-9
Binary: Built with gc go1.16.7 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
Log file created at: 2021/08/23 05:41:35
Running on machine: ip-10-100-120-9
Binary: Built with gc go1.16.7 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
Log file created at: 2021/08/23 05:46:47
Running on machine: ip-10-100-120-9
Binary: Built with gc go1.16.7 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
Log file created at: 2021/08/23 05:51:57
Running on machine: ip-10-100-120-9
Binary: Built with gc go1.16.7 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
Log file created at: 2021/08/23 05:56:59
Running on machine: ip-10-100-120-9
Binary: Built with gc go1.16.7 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg

Any pointers to what is happening would help. Thanks in advance.


Solution

  • The issue with default kernel settings was a bug in kOps. The installer did not set the sysctl settings that the kubelet expects.

    The issue with the admission controller is simply a missing admission controller configuration file.
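
To make the second point concrete, here is a minimal sketch of how an EventRateLimit configuration could be supplied through the cluster spec. It assumes the kOps `admissionControlConfigFile` field under `kubeAPIServer` together with a `fileAssets` entry, in the same way the question already ships its audit policy. The file path, the asset name, and the `qps`/`burst` values are illustrative assumptions, not values from the question. The path must be in a directory that kOps mounts into the kube-apiserver pod, so check that for your kOps version.

```yaml
# Fragment to merge into the existing Cluster spec (sketch, not a full manifest).
spec:
  kubeAPIServer:
    enableAdmissionPlugins:
      - EventRateLimit
    # Assumed path; it must be readable from inside the kube-apiserver pod.
    admissionControlConfigFile: /srv/kubernetes/kube-apiserver/admission-configuration.yaml
  fileAssets:
    - name: admission-configuration   # hypothetical asset name
      path: /srv/kubernetes/kube-apiserver/admission-configuration.yaml
      roles:
        - Master
      content: |
        apiVersion: apiserver.config.k8s.io/v1
        kind: AdmissionConfiguration
        plugins:
          - name: EventRateLimit
            configuration:
              apiVersion: eventratelimit.admission.k8s.io/v1alpha1
              kind: Configuration
              limits:
                # Illustrative limits only; tune for your cluster.
                - type: Server
                  qps: 50
                  burst: 100
```

For the first point, upgrading to a kOps release that includes the fix is the cleaner route. If that is not possible, one workaround to try is setting the kernel parameters explicitly with the kOps `sysctlParameters` field. The values below are the ones the kubelet checks when `protectKernelDefaults` is enabled, as far as I know. Treat them as an assumption and compare them with the mismatch errors in the kubelet log on the node (for example via `journalctl -u kubelet`).

```yaml
# Fragment to merge into the existing Cluster spec (sketch under the assumptions above).
spec:
  sysctlParameters:
    - vm.overcommit_memory=1
    - vm.panic_on_oom=0
    - kernel.panic=10
    - kernel.panic_on_oops=1
    - kernel.keys.root_maxkeys=1000000
    - kernel.keys.root_maxbytes=25000000
```

Whether the workaround takes effect before the kubelet starts on kOps 1.21.0 is not something the original answer confirms, so verify on a test cluster first.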