MySQL database in Azure cluster using Azure Files as PV won't start


I have an Azure Kubernetes cluster, but because of the limit on attached default volumes per node (8 at my node size), I had to find a different way to provision volumes.
The solution was to use Azure Files volumes. I followed this article, https://learn.microsoft.com/en-us/azure/aks/azure-files-volume#mount-options, and it works: the volume is mounted.

The problem is the MySQL instance: it just won't start.

For testing, I created a deployment with two simple DB containers: one uses a volume from the default storage class and the other uses Azure Files.

Here is my manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-db
  labels:
    prj: test-db
spec:
  selector:
    matchLabels:
      prj: test-db
  template:
    metadata:
      labels:
        prj: test-db
    spec:
      containers:
        - name: db-default
          image: mysql:5.7.37
          imagePullPolicy: IfNotPresent
          args:
            - "--ignore-db-dir=lost+found"
          ports:
            - containerPort: 3306
              name: mysql
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: password
          volumeMounts:
            - name: default-pv
              mountPath: /var/lib/mysql
              subPath: test

        - name: db-azurefiles
          image: mysql:5.7.37
          imagePullPolicy: IfNotPresent
          args:
            - "--ignore-db-dir=lost+found"
            - "--initialize-insecure"
          ports:
            - containerPort: 3306
              name: mysql
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: password
          volumeMounts:
            - name: azurefile-pv
              mountPath: /var/lib/mysql
              subPath: test
      volumes:
        - name: default-pv
          persistentVolumeClaim:
            claimName: default-pvc
        - name: azurefile-pv
          persistentVolumeClaim:
            claimName: azurefile-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: default-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: default
  resources:
    requests:
      storage: 200Mi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: azurefile-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: azure-file-store
  resources:
    requests:
      storage: 200Mi
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azure-file-store
mountOptions:
- dir_mode=0777
- file_mode=0777
- uid=0
- gid=0
- mfsymlinks
- cache=strict
- nosharesock
parameters:
  skuName: Standard_LRS
provisioner: file.csi.azure.com
reclaimPolicy: Delete
volumeBindingMode: Immediate

The container with the default PV works without any problem, but the one with Azure Files fails with this error:

[Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.37-1debian10 started.
[Note] [Entrypoint]: Switching to dedicated user 'mysql'
[Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.37-1debian10 started.
[Note] [Entrypoint]: Initializing database files
[Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
[Warning] InnoDB: New log files created, LSN=45790
[Warning] InnoDB: Creating foreign key constraint system tables.
[Warning] No existing UUID has been found, so we assume that this is the first time that this server has been started. Generating a new UUID: e86bdae0-979b-11ec-abbf-f66bf9455d85.
[Warning] Gtid table is not ready to be used. Table 'mysql.gtid_executed' cannot be opened.
mysqld: Can't change permissions of the file 'ca-key.pem' (Errcode: 1 - Operation not permitted)
[ERROR] Could not set file permission for ca-key.pem
[ERROR] Aborting

Based on the error, it seems that the database can't write to the volume mount, but that's not (entirely) true. I mounted both volumes into another container to inspect the files. Here is the output, which shows that the database was able to write files to the volume:

-rwxrwxrwx 1 root root       56 Feb 27 07:07 auto.cnf
-rwxrwxrwx 1 root root     1680 Feb 27 07:07 ca-key.pem
-rwxrwxrwx 1 root root      215 Feb 27 07:07 ib_buffer_pool
-rwxrwxrwx 1 root root 50331648 Feb 27 07:07 ib_logfile0
-rwxrwxrwx 1 root root 50331648 Feb 27 07:07 ib_logfile1
-rwxrwxrwx 1 root root 12582912 Feb 27 07:07 ibdata1

Obviously, some files are missing, but this output disproves my assumption that MySQL can't write to the folder.
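
The listing shows that the files are writable but owned by root. The same check could be scripted; the following is a minimal sketch, assuming Python is available in the inspection container. `ownership_report` is a hypothetical helper, and the uid to compare against would come from running `id -u mysql` in the database image.

```python
import os
import stat
import tempfile


def ownership_report(directory, expected_uid):
    """Return (name, uid, gid, mode, owned_by_expected) for each entry in directory."""
    rows = []
    for name in sorted(os.listdir(directory)):
        info = os.lstat(os.path.join(directory, name))
        rows.append((
            name,
            info.st_uid,
            info.st_gid,
            stat.filemode(info.st_mode),  # human-readable mode, like ls -l
            info.st_uid == expected_uid,
        ))
    return rows


if __name__ == "__main__":
    # Demo on a throwaway directory. In the pod you would pass the mount
    # path (/var/lib/mysql) and the uid of the mysql user instead.
    with tempfile.TemporaryDirectory() as data_dir:
        open(os.path.join(data_dir, "ca-key.pem"), "w").close()
        for row in ownership_report(data_dir, expected_uid=os.geteuid()):
            print(row)
```

Any row whose last field is False marks a file that the expected user can write to (given mode 0777) but does not own.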

My guess is that MySQL can't work properly with the file system used by Azure Files.

What I tried:

  • different versions of MySQL (5.7.16, 5.7.24, 5.7.31, 5.7.37) and MariaDB (10.6)
  • different arguments for mysql
  • recreating the storage with NFS v3 enabled
  • a custom MySQL image with cifs-utils added
  • different permissions, gid/uid, and other attributes of the container and the storage class

Solution

  • The issue appears to be the ownership of files on volumes mounted this way.

    If you change the storage class so that uid/gid match the mysql user, the pod can start:

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: azure-file-store
    mountOptions:
    - dir_mode=0777
    - file_mode=0777
    - uid=999
    - gid=999
    - mfsymlinks
    - cache=strict
    - nosharesock
    parameters:
      skuName: Standard_LRS
    provisioner: file.csi.azure.com
    reclaimPolicy: Delete
    volumeBindingMode: Immediate
    

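    The mount options fix the owner of every file in the mount, which doesn't work well for anything that wants to own the files it creates. With uid=0 and gid=0 the files belong to root, so mysqld, which runs as the mysql user, is refused when it tries to change the permissions of ca-key.pem. Because everything is created with mode 0777, any user can read and write in the directories, just not own the files.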
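
To make the ownership rule concrete, here is a minimal Python sketch that is not part of the original answer. It assumes the simplified POSIX rule that only a file's owner or root may change its mode (Linux capabilities are ignored), and it takes uid 999 for the mysql user from the storage class above. `may_chmod` is a hypothetical helper used only for illustration.

```python
import os
import stat
import tempfile


def may_chmod(file_owner_uid, caller_euid):
    """Simplified POSIX rule: only the owner or root (uid 0) may chmod a file."""
    return caller_euid == 0 or caller_euid == file_owner_uid


MYSQL_UID = 999  # assumption: uid of the mysql user, taken from the answer

# Share mounted with uid=0: files appear owned by root, so mysqld is refused.
print(may_chmod(file_owner_uid=0, caller_euid=MYSQL_UID))          # False
# Share mounted with uid=999: mysqld owns the files, so chmod is allowed.
print(may_chmod(file_owner_uid=MYSQL_UID, caller_euid=MYSQL_UID))  # True

# Sanity check against the real OS: chmod on a file we own succeeds.
with tempfile.NamedTemporaryFile() as handle:
    os.chmod(handle.name, 0o600)
    print(stat.filemode(os.stat(handle.name).st_mode))
```

The first case corresponds to the "Operation not permitted" error in the log, the second to the storage class with uid=999 and gid=999.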