I run OpenStack Cinder with Ceph as its storage backend. Occasionally, deleting a Cinder volume fails.
To troubleshoot, I tried removing the image directly with the rbd command. Running rbd rm ${pool}/${volume-id} printed this error:
rbd: error: image still has watchers
This means the image is still open or the client using it crashed. Try again after closing/unmapping it or waiting 30s for the crashed client to timeout.
Then rbd status ${pool}/${volume-id} shows:
Watchers:
watcher=172.18.0.1:0/523356342 client.230016780 cookie=94001004445696
I am confused about why the watcher sticks to the volume and prevents it from being deleted. Is there a reason for this, or did I do something wrong?
And how can I delete the volume in this case?
I found a workaround for this issue: add the watcher to the OSD blacklist with ceph osd blacklist add, after which the volume becomes removable. After deleting the volume, remove the watcher from the blacklist again.
$ ceph osd blacklist add 172.18.0.1:0/523356342
blacklisting 172.18.0.1:0/523356342
$ rbd status ${pool}/${volume-id}
Watchers: none
$ rbd rm ${pool}/${volume-id}
Removing image: 100% complete...done.
$ ceph osd blacklist rm 172.18.0.1:0/523356342
un-blacklisting 172.18.0.1:0/523356342
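The steps above could be wrapped in a small script. This is only a sketch under a few assumptions: the function names watcher_addrs and force_rm_image are made up for illustration, the parsing relies on rbd status printing watcher lines in the format shown earlier, and it uses the same ceph osd blacklist subcommand as above (I believe newer Ceph releases spell it blocklist, so adjust for your version).

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Ceph or Cinder.

# Read `rbd status` output on stdin and print one watcher address per line.
# Assumes lines of the form "watcher=<addr> client.<id> cookie=<n>".
watcher_addrs() {
    sed -n 's/^[[:space:]]*watcher=\([^[:space:]]*\).*/\1/p'
}

# Blacklist every watcher of an image, remove the image, then lift the
# blacklist entries again. Blacklisting cuts off a client that is still
# alive, so only use this when the watcher is known to be stale.
force_rm_image() {
    image="$1"   # e.g. "mypool/volume-1234"
    addrs=$(rbd status "$image" | watcher_addrs)

    for addr in $addrs; do
        ceph osd blacklist add "$addr"
    done

    rbd rm "$image"
    status=$?

    # Un-blacklist even if the removal failed, so no stale entries remain.
    for addr in $addrs; do
        ceph osd blacklist rm "$addr"
    done

    return "$status"
}
```

After sourcing the file, it would be invoked as force_rm_image "${pool}/${volume-id}". The return code is that of rbd rm, so a caller can tell whether the image was actually removed.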
That's all, but I am still looking for the root cause.