Mesos agent always in Deactivated state


I deployed a Mesos cluster on two virtual hosts in VMware Workstation:

  • heron01 (IP 192.168.201.131): running the Mesos master and ZooKeeper
  • heron02 (IP 192.168.201.128): running the Mesos slave

However, the slave is always in the Deactivated state. The Mesos master ERROR log is as follows:

Log file created at: 2018/02/18 02:08:35
Running on machine: ubuntu
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
W0218 02:08:35.859475  5857 authenticator.cpp:513] No credentials provided, authentication requests will be refused
E0218 02:08:40.518481  5859 process.cpp:2577] Failed to shutdown socket with fd 28, address 127.0.0.1:39882: Transport endpoint is not connected
E0218 02:08:40.523883  5859 process.cpp:2577] Failed to shutdown socket with fd 28, address 127.0.0.1:39884: Transport endpoint is not connected
W0218 02:08:40.532027  5854 master.cpp:7557] Master returning resources offered because agent eae9d24b-3cf3-4a0b-9546-dfde4288fbc8-S0 at slave(1)@127.0.1.1:5051 (ubuntu) is disconnected
E0218 02:08:41.131724  5859 process.cpp:2577] Failed to shutdown socket with fd 28, address 127.0.0.1:39886: Transport endpoint is not connected
W0218 02:08:41.135860  5857 master.cpp:7557] Master returning resources offered because agent eae9d24b-3cf3-4a0b-9546-dfde4288fbc8-S1 at slave(1)@127.0.1.1:5051 (ubuntu) is disconnected
E0218 02:08:41.580379  5859 process.cpp:2577] Failed to shutdown socket with fd 28, address 127.0.0.1:39888: Transport endpoint is not connected
E0218 02:08:41.583258  5859 process.cpp:2577] Failed to shutdown socket with fd 28, address 127.0.0.1:39890: Transport endpoint is not connected
W0218 02:08:41.585355  5858 master.cpp:7557] Master returning resources offered because agent eae9d24b-3cf3-4a0b-9546-dfde4288fbc8-S2 at slave(1)@127.0.1.1:5051 (ubuntu) is disconnected
E0218 02:08:48.556628  5859 process.cpp:2577] Failed to shutdown socket with fd 28, address 127.0.0.1:39892: Transport endpoint is not connected
E0218 02:08:48.562399  5859 process.cpp:2577] Failed to shutdown socket with fd 28, address 127.0.0.1:39894: Transport endpoint is not connected
E0218 02:08:48.566049  5859 process.cpp:2577] Failed to shutdown socket with fd 28, address 127.0.0.1:39896: Transport endpoint is not connected
W0218 02:08:48.567793  5853 master.cpp:7557] Master returning resources offered because agent eae9d24b-3cf3-4a0b-9546-dfde4288fbc8-S3 at slave(1)@127.0.1.1:5051 (ubuntu) is disconnected
E0218 02:09:00.063712  5859 process.cpp:2577] Failed to shutdown socket with fd 35, address 127.0.0.1:39914: Transport endpoint is not connected

The Mesos slave WARNING log is as follows:

Log file created at: 2018/02/17 08:25:51
Running on machine: ubuntu
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E0217 08:25:51.034782 48017 process.cpp:2577] Failed to shutdown socket with fd 8, address 192.168.201.129:45090: Transport endpoint is not connected
E0217 08:25:51.040766 48017 process.cpp:2577] Failed to shutdown socket with fd 8, address 192.168.201.129:45092: Transport endpoint is not connected
W0217 08:25:51.041786 48017 slave.cpp:5010] Master disconnected! Waiting for a new master to be elected
E0217 08:25:51.631784 48019 process.cpp:2577] Failed to shutdown socket with fd 8, address 192.168.201.129:45094: Transport endpoint is not connected
W0217 08:25:51.632076 48017 slave.cpp:5010] Master disconnected! Waiting for a new master to be elected
W0217 08:25:52.095075 48011 slave.cpp:5010] Master disconnected! Waiting for a new master to be elected
E0217 08:25:52.095427 48019 process.cpp:2577] Failed to shutdown socket with fd 8, address 192.168.201.129:45096: Transport endpoint is not connected
W0217 08:25:59.021628 48012 slave.cpp:5010] Master disconnected! Waiting for a new master to be elected
E0217 08:25:59.022001 48019 process.cpp:2577] Failed to shutdown socket with fd 8, address 192.168.201.129:45098: Transport endpoint is not connected
W0217 08:26:10.564131 48016 slave.cpp:5010] Master disconnected! Waiting for a new master to be elected
E0217 08:26:10.564538 48019 process.cpp:2577] Failed to shutdown socket with fd 8, address 192.168.201.129:45100: Transport endpoint is not connected
W0217 08:26:12.141916 48012 slave.cpp:5010] Master disconnected! Waiting for a new master to be elected
E0217 08:26:12.142215 48019 process.cpp:2577] Failed to shutdown socket with fd 8, address 192.168.201.129:45102: Transport endpoint is not connected
W0217 08:26:39.090140 48018 slave.cpp:5010] Master disconnected! Waiting for a new master to be elected
E0217 08:26:39.090345 48019 process.cpp:2577] Failed to shutdown socket with fd 8, address 192.168.201.129:45104: Transport endpoint is not connected
E0217 08:27:38.279918 48019 process.cpp:2577] Failed to shutdown socket with fd 8, address 192.168.201.129:45106: Transport endpoint is not connected


I configured the cluster by editing the configuration files. The configuration is as follows.
1. mesos-master-env.sh

export MESOS_log_dir=/home/yitian/mesosdata/log
export MESOS_work_dir=/home/yitian/mesosdata/data
export MESOS_ZK=zk://heron01:2181/mesos
export MESOS_quorum=1

2. mesos-slave-env.sh and mesos-agent-env.sh

export MESOS_master=heron01:5050
export MESOS_log_dir=/home/yitian/mesosdata/log
export MESOS_work_dir=/home/yitian/mesosdata/run

3. masters

heron01

4. slaves

heron02

What's more, the hostname and IP of each host have been added to /etc/hosts, and both hosts have the same configuration files. How can I fix this? Thanks for your help!


Solution

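  • I believe the master IP is not set correctly. Your master log shows the agent registering as slave(1)@127.0.1.1:5051, which is a loopback address, so the processes appear to be advertising a local address instead of the 192.168.201.x one. Start both processes with an explicit --ip, as in the commands below. Note that if you use ZooKeeper, you cannot use 127.0.0.1 either.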

    master

    mesos-master --ip=192.168.201.131 --work_dir=/tmp/mesos
    

    agent

    mesos-agent --ip=192.168.201.128 --master=192.168.201.131:5050 --work_dir=/tmp/mesos
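
    Since the question configures Mesos through env files rather than command-line flags, here is a rough sketch of how the same fix might look there, plus a quick check for the loopback problem. It assumes that --ip can be set as MESOS_ip, following the same MESOS_<flag> convention the question's files already use for log_dir and work_dir. The function names (is_loopback, resolved_ip, check_bind_address) are made up for illustration and are not part of Mesos.

```shell
#!/bin/sh
# Assumed env-file equivalent of the --ip flags above:
#   mesos-master-env.sh on heron01:  export MESOS_ip=192.168.201.131
#   mesos-agent-env.sh  on heron02:  export MESOS_ip=192.168.201.128

# Hypothetical helper: succeeds when the given IPv4 address is in 127.0.0.0/8.
is_loopback() {
    case "$1" in
        127.*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Hypothetical helper: print the first address the system resolver returns
# for this machine's hostname (typically taken from /etc/hosts).
resolved_ip() {
    getent hosts "$(hostname)" | awk '{ print $1; exit }'
}

# Hypothetical helper: report whether the hostname resolves to loopback.
check_bind_address() {
    ip="$(resolved_ip)"
    if [ -z "$ip" ]; then
        echo "hostname does not resolve; set MESOS_ip explicitly"
    elif is_loopback "$ip"; then
        echo "hostname resolves to loopback ($ip); set MESOS_ip explicitly"
    else
        echo "hostname resolves to $ip"
    fi
}
```

    Running check_bind_address on each host should show whether the hostname maps to a 127.x address. If it does, that would be consistent with the 127.0.1.1 address in the master log, and setting the IP explicitly (or fixing the /etc/hosts entry) is the likely remedy.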