Search code examples
kubernetesstoragecephrook-storage

How to run Rook-Ceph on one node?


I have a single node development Kubernetes cluster running on bare metal (ubuntu 18.04) and I need to test my application with rook-ceph.

I followed the rook-ceph instructions (https://rook.io/docs/rook/v1.3/ceph-quickstart.html) for installing it on a K8s cluster as seen below and the only thing I changed was instead of installing cluster.yaml I installed cluster-test.yamlat the last step because in the documentation it is mentioned that "cluster-test.yaml: Cluster settings for a test environment such as minikube"

git clone --single-branch --branch release-1.3 https://github.com/rook/rook.git
cd rook/cluster/examples/kubernetes/ceph
kubectl create -f common.yaml
kubectl create -f operator.yaml
kubectl create -f cluster.yaml

After installing everything the OSD pod cannot get launched:

kubectl -n rook-ceph get pods
NAME                                            READY   STATUS      RESTARTS   AGE
csi-cephfsplugin-pgpp5                          3/3     Running     0          3d
csi-cephfsplugin-provisioner-75f4cb8c76-9xw4m   5/5     Running     2          3d
csi-cephfsplugin-provisioner-75f4cb8c76-lk4h6   5/5     Running     1          3d
csi-rbdplugin-provisioner-6cfb8565c4-5dstt      6/6     Running     2          3d
csi-rbdplugin-provisioner-6cfb8565c4-k7t5r      6/6     Running     0          3d
csi-rbdplugin-rp2rm                             3/3     Running     0          3d
rook-ceph-mgr-a-5b78844689-k66pv                1/1     Running     0          3d
rook-ceph-mon-a-69f75569d9-prtlv                1/1     Running     0          3d
rook-ceph-operator-5698b8bd78-nrgvt             1/1     Running     1          3d
rook-ceph-osd-prepare-odin-wjvps                0/1     Completed   0          36m
rook-discover-n7x5s                             1/1     Running     0          3d

And the reason for that is, it cannot find volume rook-binaries":

kubectl -n rook-ceph describe pod rook-ceph-osd-prepare-odin-wjvps
...
...
Events:
  Type     Reason          Age                    From               Message
  ----     ------          ----                   ----               -------
  Normal   Scheduled       <unknown>              default-scheduler  Successfully assigned rook-ceph/rook-ceph-osd-prepare-odin-wjvps to odin
  Normal   Created         7m26s                  kubelet, odin      Created container copy-bins
  Normal   Started         7m26s                  kubelet, odin      Started container copy-bins
  Normal   Pulled          7m25s                  kubelet, odin      Container image "ceph/ceph:v15" already present on machine
  Normal   Created         7m25s                  kubelet, odin      Created container provision
  Normal   Started         7m25s                  kubelet, odin      Started container provision
  Normal   SandboxChanged  7m22s                  kubelet, odin      Pod sandbox changed, it will be killed and re-created.
  Normal   Pulled          7m21s (x2 over 7m26s)  kubelet, odin      Container image "rook/ceph:v1.3.8" already present on machine
  Warning  Failed          7m21s                  kubelet, odin      Error: cannot find volume "rook-binaries" to mount into container "copy-bins"

For some reason the OSD skip my nvme hard drive:

kubectl -n rook-ceph logs rook-ceph-osd-prepare-odin-wjvps -f
2020-07-27 15:20:40.463045 I | rookcmd: starting Rook v1.3.8 with arguments '/rook/rook ceph osd provision'
2020-07-27 15:20:40.463108 I | rookcmd: flag values: --cluster-id=9ebb0292-5238-4234-bbee-565c4de14571, --data-device-filter=all, --data-device-path-filter=, --data-devices=, --encrypted-device=false, --force-format=false, --help=false, --location=, --log-flush-frequency=5s, --log-level=DEBUG, --metadata-device=, --node-name=odin, --operator-image=, --osd-database-size=0, --osd-store=, --osd-wal-size=576, --osds-per-device=1, --pvc-backed-osd=false, --service-account=
2020-07-27 15:20:40.463114 I | op-mon: parsing mon endpoints: a=10.5.98.76:6789
2020-07-27 15:20:40.470018 I | op-osd: CRUSH location=root=default host=odin
2020-07-27 15:20:40.470033 I | cephcmd: crush location of osd: root=default host=odin
2020-07-27 15:20:40.474201 I | cephconfig: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2020-07-27 15:20:40.474320 I | cephconfig: generated admin config in /var/lib/rook/rook-ceph
2020-07-27 15:20:40.474440 D | cephosd: config file @ /etc/ceph/ceph.conf: [global]
fsid                = 4670f13d-e80a-4c20-a161-0079c69953db
mon initial members = a
mon host            = [v2:10.5.98.76:3300,v1:10.5.98.76:6789]
public addr         = 10.4.91.50
cluster addr        = 10.4.91.50

[client.admin]
keyring = /var/lib/rook/rook-ceph/client.admin.keyring

2020-07-27 15:20:40.474450 I | cephosd: discovering hardware
2020-07-27 15:20:40.474458 D | exec: Running command: lsblk --all --noheadings --list --output KNAME
2020-07-27 15:20:40.478069 D | exec: Running command: lsblk /dev/loop0 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.479838 W | inventory: skipping device "loop0" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.479857 D | exec: Running command: lsblk /dev/loop1 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.481477 W | inventory: skipping device "loop1" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.481500 D | exec: Running command: lsblk /dev/loop2 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.483081 W | inventory: skipping device "loop2" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.483100 D | exec: Running command: lsblk /dev/loop3 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.484695 W | inventory: skipping device "loop3" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.484712 D | exec: Running command: lsblk /dev/loop4 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.486387 W | inventory: skipping device "loop4" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.486416 D | exec: Running command: lsblk /dev/loop5 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.488108 W | inventory: skipping device "loop5" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.488130 D | exec: Running command: lsblk /dev/loop6 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.490429 W | inventory: skipping device "loop6" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.490460 D | exec: Running command: lsblk /dev/loop7 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.492027 W | inventory: skipping device "loop7" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.492040 D | exec: Running command: lsblk /dev/sda --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.493631 W | inventory: skipping device "sda" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.493645 D | exec: Running command: lsblk /dev/nbd0 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.495394 W | inventory: skipping device "nbd0" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.495411 D | exec: Running command: lsblk /dev/nbd1 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.497052 W | inventory: skipping device "nbd1" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.497071 D | exec: Running command: lsblk /dev/nbd2 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.498722 W | inventory: skipping device "nbd2" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.498741 D | exec: Running command: lsblk /dev/nbd3 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.500671 W | inventory: skipping device "nbd3" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.500735 D | exec: Running command: lsblk /dev/nbd4 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.502638 W | inventory: skipping device "nbd4" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.502669 D | exec: Running command: lsblk /dev/nbd5 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.504580 W | inventory: skipping device "nbd5" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.504598 D | exec: Running command: lsblk /dev/nbd6 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.506366 W | inventory: skipping device "nbd6" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.506384 D | exec: Running command: lsblk /dev/nbd7 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.507990 W | inventory: skipping device "nbd7" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.508008 D | exec: Running command: lsblk /dev/nvme0n1 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.510929 D | exec: Running command: sgdisk --print /dev/nvme0n1
2020-07-27 15:20:40.513218 D | exec: Running command: udevadm info --query=property /dev/nvme0n1
2020-07-27 15:20:40.520321 D | exec: Running command: lsblk --noheadings --pairs /dev/nvme0n1
2020-07-27 15:20:40.523182 I | inventory: skipping device "nvme0n1" because it has child, considering the child instead.
2020-07-27 15:20:40.523220 D | exec: Running command: lsblk /dev/nvme0n1p1 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.526389 D | exec: Running command: udevadm info --query=property /dev/nvme0n1p1
2020-07-27 15:20:40.534070 D | exec: Running command: lsblk /dev/nvme0n1p2 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.536121 D | exec: Running command: udevadm info --query=property /dev/nvme0n1p2
2020-07-27 15:20:40.541434 D | exec: Running command: lsblk /dev/nvme0n1p3 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.543247 D | exec: Running command: udevadm info --query=property /dev/nvme0n1p3
2020-07-27 15:20:40.547962 D | exec: Running command: lsblk /dev/nvme0n1p4 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.549708 D | exec: Running command: udevadm info --query=property /dev/nvme0n1p4
2020-07-27 15:20:40.554266 D | exec: Running command: lsblk /dev/nbd8 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.555923 W | inventory: skipping device "nbd8" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.555937 D | exec: Running command: lsblk /dev/nbd9 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.557480 W | inventory: skipping device "nbd9" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.557493 D | exec: Running command: lsblk /dev/nbd10 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.559103 W | inventory: skipping device "nbd10" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.559118 D | exec: Running command: lsblk /dev/nbd11 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.560668 W | inventory: skipping device "nbd11" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.560682 D | exec: Running command: lsblk /dev/nbd12 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.562237 W | inventory: skipping device "nbd12" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.562251 D | exec: Running command: lsblk /dev/nbd13 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.563800 W | inventory: skipping device "nbd13" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.563814 D | exec: Running command: lsblk /dev/nbd14 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.565334 W | inventory: skipping device "nbd14" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.565349 D | exec: Running command: lsblk /dev/nbd15 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.566990 W | inventory: skipping device "nbd15" because 'lsblk' failed. diskType is empty
2020-07-27 15:20:40.567005 D | inventory: discovered disks are [0xc0001eaa20 0xc0001ebb00 0xc0001e9d40 0xc00016e360]
2020-07-27 15:20:40.567009 I | cephosd: creating and starting the osds
2020-07-27 15:20:40.567029 D | cephosd: desiredDevices are [{Name:all OSDsPerDevice:1 MetadataDevice: DatabaseSizeMB:0 DeviceClass: IsFilter:true IsDevicePathFilter:false}]
2020-07-27 15:20:40.567033 D | cephosd: context.Devices are [0xc0001eaa20 0xc0001ebb00 0xc0001e9d40 0xc00016e360]
2020-07-27 15:20:40.567037 I | cephosd: skipping device "nvme0n1p1" because it contains a filesystem "vfat"
2020-07-27 15:20:40.567042 D | exec: Running command: udevadm info --query=property /dev/nvme0n1p2
2020-07-27 15:20:40.571765 D | exec: Running command: lsblk /dev/nvme0n1p2 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME
2020-07-27 15:20:40.573542 D | exec: Running command: ceph-volume inventory --format json /dev/nvme0n1p2
2020-07-27 15:20:40.997597 I | cephosd: skipping device "nvme0n1p2": ["Insufficient space (<5GB)"].
2020-07-27 15:20:40.997615 I | cephosd: skipping device "nvme0n1p3" because it contains a filesystem "ntfs"
2020-07-27 15:20:40.997620 I | cephosd: skipping device "nvme0n1p4" because it contains a filesystem "ext4"
2020-07-27 15:20:41.001473 I | cephosd: configuring osd devices: {"Entries":{}}
2020-07-27 15:20:41.001492 I | cephosd: no new devices to configure. returning devices already configured with ceph-volume.
2020-07-27 15:20:41.001502 D | exec: Running command: ceph-volume lvm list  --format json
2020-07-27 15:20:41.324600 I | cephosd: 0 ceph-volume lvm osd devices configured on this node
2020-07-27 15:20:41.324618 W | cephosd: skipping OSD configuration as no devices matched the storage settings for this node "odin"

And my harddrive:

sudo df -h -T
Filesystem     Type      Size  Used Avail Use% Mounted on
udev           devtmpfs  7,8G     0  7,8G   0% /dev
tmpfs          tmpfs     1,6G  3,9M  1,6G   1% /run
/dev/nvme0n1p4 ext4      284G  130G  140G  48% /
tmpfs          tmpfs     7,8G  539M  7,3G   7% /dev/shm
tmpfs          tmpfs     5,0M  4,0K  5,0M   1% /run/lock
tmpfs          tmpfs     7,8G     0  7,8G   0% /sys/fs/cgroup
/dev/nvme0n1p1 vfat      508M   31M  478M   6% /boot/efi

How can I solve this? I just need to run it temporarily for my dev (for some reasons I cannot install my application on minikube/microk8s)


Solution

  • The problem was that Ceph needs an unformatted partition or HDD and after I deleted my NTFS partition it started to work.