
SeaweedFS volume management


I have two questions about a SeaweedFS cluster we have running. The leader is started with the following command:

/usr/local/bin/weed server -ip=192.168.13.154 -ip.bind=192.168.13.154 -dir=/opt/seaweedfs/volume-1,/opt/seaweedfs/volume-2,/opt/seaweedfs/volume-3 -master.dir=/opt/seaweedfs/master -master.peers=192.168.13.154:9333,192.168.13.155:9333,192.168.13.156:9333 -volume.max=30,30,30 -filer=true -s3=true -metrics.address=192.168.13.84:9091
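For context, the capacity implied by these flags works out roughly as follows (a quick sketch; it assumes the default 30GB volume size limit, which matches the "30GB" shown in the status output further down):

```python
# Capacity implied by the startup flags on each node (assumption: 30GB
# volume size limit, as suggested by "Version": "30GB 1.44" in the status).
volume_size_limit_gb = 30

# -volume.max=30,30,30 allows up to 30 volumes in each of the three dirs.
max_volumes_per_dir = [30, 30, 30]
max_volumes_per_node = sum(max_volumes_per_dir)   # 90, matches "Max": 90

# Maximum logical capacity per node vs. the three ~1TB partitions backing it.
max_capacity_gb = max_volumes_per_node * volume_size_limit_gb
disk_capacity_gb = 3 * 1007

print(max_volumes_per_node)   # 90
print(max_capacity_gb)        # 2700
print(disk_capacity_gb)       # 3021
```

So the volume limit is sized close to the physical disk capacity, and the cluster-wide "Max": 270 in the status below is simply 3 nodes × 90.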

Question 1

I created a master.toml file using weed scaffold -config=master:

[master.maintenance]
# periodically run these scripts (the same as running them from 'weed shell')
scripts = """
  ec.encode -fullPercent=95 -quietFor=1h
  ec.rebuild -force
  ec.balance -force
  volume.balance -force
"""
sleep_minutes = 17          # sleep minutes between each script execution

However, the maintenance scripts seem to fail with:

shell failed to keep connected to localhost:9333: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp [::1]:19333: connect: connection refused"

This makes sense, since the master is bound to IP 192.168.13.154 while the maintenance script tries to connect to localhost. How can I specify the master IP in the master.toml file?
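Until the config file supports this, one possible workaround (an assumption, not verified against this version) is to run the same script manually through `weed shell`, pointing it at the bound master address with the `-master` flag instead of letting it default to localhost:

```
# Hypothetical workaround: pipe the maintenance commands into 'weed shell',
# explicitly targeting the master's bound IP rather than localhost:9333.
cat <<'EOF' | /usr/local/bin/weed shell -master=192.168.13.154:9333
ec.encode -fullPercent=95 -quietFor=1h
ec.rebuild -force
ec.balance -force
volume.balance -force
EOF
```

This could also be scheduled from cron as a stand-in for the built-in `sleep_minutes` loop.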

Question 2

The number of volumes seems to grow faster than the amount of disk space used. For example, on the .154 server there are only 11 free volume slots left, but judging by the free disk space there should be many more.

Status:

{
  "Topology": {
    "DataCenters": [
      {
        "Free": 16,
        "Id": "DefaultDataCenter",
        "Max": 270,
        "Racks": [
          {
            "DataNodes": [
              {
                "EcShards": 0,
                "Free": 0,
                "Max": 90,
                "PublicUrl": "192.168.13.155:8080",
                "Url": "192.168.13.155:8080",
                "Volumes": 90
              },
              {
                "EcShards": 0,
                "Free": 11,
                "Max": 90,
                "PublicUrl": "192.168.13.154:8080",
                "Url": "192.168.13.154:8080",
                "Volumes": 79
              },
              {
                "EcShards": 0,
                "Free": 5,
                "Max": 90,
                "PublicUrl": "192.168.13.156:8080",
                "Url": "192.168.13.156:8080",
                "Volumes": 85
              }
            ],
            "Free": 16,
            "Id": "DefaultRack",
            "Max": 270
          }
        ]
      }
    ],
    "Free": 16,
    "Max": 270,
    "layouts": [
      ...
    ]
  },
  "Version": "30GB 1.44"
}

Disk (192.168.13.154):

/dev/sdb1      1007G  560G  397G  59% /opt/seaweedfs/volume-1
/dev/sdc1      1007G  542G  414G  57% /opt/seaweedfs/volume-2
/dev/sdd1      1007G  398G  559G  42% /opt/seaweedfs/volume-3
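A rough check of these numbers (copied from the status and `df` output above, and again assuming the 30GB volume size limit) shows the average volume on .154 is only about two-thirds full, which is consistent with volumes being created faster than they fill up:

```python
# Average fill level of the volumes on 192.168.13.154, using the figures
# reported above (assumes a 30GB per-volume size limit; disk usage is an
# approximation since it includes index files as well as .dat files).
used_gb = 560 + 542 + 398     # df "Used" across the three volume dirs
volumes = 79                  # "Volumes" reported for 192.168.13.154
volume_size_limit_gb = 30

avg_volume_gb = used_gb / volumes
fill_percent = avg_volume_gb / volume_size_limit_gb * 100

print(round(avg_volume_gb, 1))   # ~19.0 GB per volume on average
print(round(fill_percent))       # ~63% of the 30GB limit
```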

Is this related to the maintenance scripts not running properly, or is there something else I'm not understanding correctly?


Solution

  • Question 1: A fix was added in https://github.com/chrislusf/seaweedfs/commit/56244fb9a13c75616aa8a9232c62d1b896906e98

  • Question 2: Likely related to master leadership changes.