kubernetes glusterfs

Trying to get rid of orphan volumes in heketi results in error without reason


I'm trying to get rid of a bunch of orphaned volumes in heketi. When I try to delete one, I get "Error:" followed by a JSON-serialized description of the volume I just tried to delete, with no reason given. I've dug through the logs, but they don't reveal anything either.

This is the command I used to try and delete the volume:

heketi-cli -s $HEKETI_CLI_SERVER --user admin --secret ${KEY}  volume delete 22f1a960651f0f16ada20a15d68c7dd6
Error: {"size":30,"name":"vol_22f1a960651f0f16ada20a15d68c7dd6","durability":{"type":"none","replicate":{},"disperse":{}},"gid":2008,"glustervolumeoptions":["","cluster.post-op-delay-secs 0"," performance.client-io-threads off"," performance.open-behind off"," performance.readdir-ahead off"," performance.read-ahead off"," performance.stat-prefetch off"," performance.write-behind off"," performance.io-cache off"," cluster.consistent-metadata on"," performance.quick-read off"," performance.strict-o-direct on"," storage.health-check-interval 0",""],"snapshot":{"enable":true,"factor":1},"id":"22f1a960651f0f16ada20a15d68c7dd6","cluster":"e924a50aa93d9eae1132c60eb1f36310","mount":{"glusterfs":{"hosts":["<SECRET>"],"device":"<SECRET>:vol_22f1a960651f0f16ada20a15d68c7dd6","options":{"backup-volfile-servers":""}}},"blockinfo":{},"bricks":[{"id":"0f4c6d7f605e9368bfe3dc7cc117b69a","path":"/var/lib/heketi/mounts/vg_970f0faf60f8dfc6f6a0d6bd25bdea7c/brick_0f4c6d7f605e9368bfe3dc7cc117b69a/brick","device":"970f0faf60f8dfc6f6a0d6bd25bdea7c","node":"107894a855c9d2c34509b18272e6c298","volume":"22f1a960651f0f16ada20a15d68c7dd6","size":31457280}]}

Notice that the second line contains only "Error:", followed by the volume's info serialized as JSON.

The volume doesn't exist in gluster. I used the commands below to verify that it is no longer there:

kubectl -n default  exec -t -i glusterfs-rgz9g bash
gluster volume info 
<shows only the volume I did not delete>

Kubernetes does not show a PersistentVolumeClaim or PersistentVolume:

kubectl get pvc -A
No resources found.
kubectl get pv -A
No resources found.

I looked at the heketi logs, but they only report GET requests for the volume:

kubectl -n default logs   heketi-56f678775c-nrbwd
[negroni] 2019-11-25T21:29:19Z | 200 |   1.407715ms | <SECRET>:8080 | GET /volumes/22f1a960651f0f16ada20a15d68c7dd6
[negroni] 2019-11-25T21:29:19Z | 200 |   1.111984ms | <SECRET>:8080 | GET /volumes/22f1a960651f0f16ada20a15d68c7dd6
[negroni] 2019-11-25T21:29:19Z | 200 |   1.540357ms | <SECRET>:8080 | GET /volumes/22f1a960651f0f16ada20a15d68c7dd6 

I've tried setting a more verbose log level, but the setting doesn't stick:

heketi-cli -s $HEKETI_CLI_SERVER --user admin --secret ${KEY}  loglevel set debug
Server log level updated
heketi-cli -s $HEKETI_CLI_SERVER --user admin --secret ${KEY}  loglevel get
info

My CLI version is:

heketi-cli -v
heketi-cli v9.0.0

And the Heketi server is running:

kubectl -n default  exec -t -i  heketi-56f678775c-nrbwd bash
heketi -v
Heketi v9.0.0-124-gc2e2a4ab

Based on the logs, I believe heketi-cli has an issue and then never actually sends the POST or DELETE request to the heketi server.
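One way to check that reading of the logs is to tally the HTTP methods the server actually recorded. The following is a minimal sketch, assuming the negroni log format shown above (pipe-separated fields, with the request in the last field); `tally_methods` is a hypothetical helper name, not part of heketi.

```ruby
# Count requests per HTTP method in negroni-style access log lines.
# Assumes the request ("GET /volumes/<id>") is the last pipe-separated field.
def tally_methods(lines)
  lines.each_with_object(Hash.new(0)) do |line, counts|
    next unless line.start_with?('[negroni]')
    request = line.split('|').last.to_s.strip
    verb = request.split.first
    counts[verb] += 1 if verb
  end
end

# Sample lines copied from the log excerpt in the question.
sample = [
  '[negroni] 2019-11-25T21:29:19Z | 200 |   1.407715ms | <SECRET>:8080 | GET /volumes/22f1a960651f0f16ada20a15d68c7dd6',
  '[negroni] 2019-11-25T21:29:19Z | 200 |   1.111984ms | <SECRET>:8080 | GET /volumes/22f1a960651f0f16ada20a15d68c7dd6',
  '[negroni] 2019-11-25T21:29:19Z | 200 |   1.540357ms | <SECRET>:8080 | GET /volumes/22f1a960651f0f16ada20a15d68c7dd6'
]

tally_methods(sample).each { |verb, count| puts "#{verb}: #{count}" }
```

For the lines quoted here only GET shows up, which is consistent with the DELETE never reaching the server.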

How do I proceed to debug this? At this point my only workaround is to recreate my cluster, but I'd like to avoid that, especially if this problem can come back.


Solution

  • It looks like there's a bug in heketi-cli: if I craft the request manually using Ruby and curl, I'm able to delete the volume:

    TOKEN=$(ruby makeToken.rb DELETE /volumes/22f1a960651f0f16ada20a15d68c7dd6)
    curl -X DELETE -H "Authorization: Bearer $TOKEN" http://10.233.21.178:8080/volumes/22f1a960651f0f16ada20a15d68c7dd6
    

    See https://github.com/heketi/heketi/blob/master/docs/api/api.md#authentication-model for how to generate the JWT.

    I built the request by hand expecting to see a more useful error message that the command-line tool had swallowed. It turned out the CLI itself was broken: the hand-built request simply succeeded.

    Ruby code for making the JWT (makeToken.rb). You need to fill in pass; the server address goes in the curl command.

    #!/usr/bin/env ruby
    # Usage: ruby makeToken.rb METHOD URI
    # e.g.   ruby makeToken.rb DELETE /volumes/<volume id>
    
    require 'jwt'
    require 'digest'
    
    user = "admin"
    pass = "<SECRET>"   # the heketi admin key
    
    method = ARGV[0]
    uri = ARGV[1]
    
    # JWT claims: issuer, issued-at, expiry, and a hash binding the
    # token to this exact method and URI.
    claims = {
      iss: user,
      iat: Time.now.to_i,
      exp: Time.now.to_i + 600,   # valid for 10 minutes
      qsh: Digest::SHA256.hexdigest("#{method}&#{uri}")
    }
    
    token = JWT.encode claims, pass, 'HS256'
    print token
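To make it clearer what the script and the curl call are doing, here is a minimal sketch that builds the same kind of HS256 token with only Ruby's standard library (no jwt gem) and prepares the DELETE request with Net::HTTP. The helper names `b64url`, `heketi_token` and `volume_delete_request` are my own, and the claim layout is simply copied from the script above; treat it as an illustration rather than a reference implementation.

```ruby
require 'openssl'
require 'digest'
require 'json'
require 'net/http'

# URL-safe base64 without padding, as used for JWT segments.
def b64url(data)
  [data].pack('m0').tr('+/', '-_').delete('=')
end

# Build an HS256-signed JWT with the same claims as makeToken.rb.
# `now` is injectable so the result can be checked deterministically.
def heketi_token(user, secret, method, path, now: Time.now.to_i)
  header = { alg: 'HS256', typ: 'JWT' }
  claims = {
    iss: user,
    iat: now,
    exp: now + 600,
    qsh: Digest::SHA256.hexdigest("#{method}&#{path}")
  }
  signing_input = [header, claims].map { |part| b64url(JSON.generate(part)) }.join('.')
  signature = OpenSSL::HMAC.digest('SHA256', secret, signing_input)
  "#{signing_input}.#{b64url(signature)}"
end

# Build (but do not send) the DELETE request for a volume.
def volume_delete_request(volume_id, token)
  request = Net::HTTP::Delete.new("/volumes/#{volume_id}")
  request['Authorization'] = "Bearer #{token}"
  request
end
```

The request can then be sent with something like `Net::HTTP.start(host, port) { |http| http.request(request) }`, and printing the response's `code` and `body` shows whatever the server actually answered, which is the information heketi-cli was not surfacing.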