
Kubernetes on Mesos, no suitable offer available


I followed the instructions on this page to build and deploy Mesos on an Ubuntu Trusty VM with one Mesos master and one slave. I started Mesos with the following commands:

$ mesos-master --ip=10.0.2.15 --work_dir=/var/lib/mesos --log_dir=/var/log/mesos
$ mesos-slave --master=10.0.2.15:5050 --containerizers=docker,mesos 

All three tests finished without error messages.

Then I followed this page to deploy Kubernetes. After building Kubernetes-Mesos, I used the following commands to deploy Kubernetes.

$ export KUBERNETES_MASTER_IP=10.0.2.15
$ export KUBERNETES_MASTER=http://${KUBERNETES_MASTER_IP}:8888
$ docker run -d --hostname $(uname -n) --name etcd \
  -p 4001:4001 -p 7001:7001 quay.io/coreos/etcd:v2.0.12 \
  --listen-client-urls http://0.0.0.0:4001 \
  --advertise-client-urls http://${KUBERNETES_MASTER_IP}:4001

The etcd container is running.

$ export PATH="$(pwd)/_output/local/go/bin:$PATH"
$ export MESOS_MASTER=10.0.2.15:5050
$ cat <<EOF >mesos-cloud.conf
[mesos-cloud]
        mesos-master        = ${MESOS_MASTER}
EOF
$ km apiserver \
  --address=${KUBERNETES_MASTER_IP} \
  --etcd-servers=http://${KUBERNETES_MASTER_IP}:4001 \
  --service-cluster-ip-range=10.10.10.0/24 \
  --port=8888 \
  --cloud-provider=mesos \
  --cloud-config=mesos-cloud.conf \
  --secure-port=0 \
  --v=1 >apiserver.log 2>&1 &
$ km controller-manager \
  --master=${KUBERNETES_MASTER_IP}:8888 \
  --cloud-provider=mesos \
  --cloud-config=./mesos-cloud.conf  \
  --v=1 >controller.log 2>&1 &
$ km scheduler \
  --address=${KUBERNETES_MASTER_IP} \
  --mesos-master=${MESOS_MASTER} \
  --etcd-servers=http://${KUBERNETES_MASTER_IP}:4001 \
  --mesos-user=root \
  --api-servers=${KUBERNETES_MASTER_IP}:8888 \
  --cluster-dns=10.10.10.10 \
  --cluster-domain=cluster.local \
  --v=2 >scheduler.log 2>&1 &

The logs look correct, with no error messages.

kubectl get services shows:

NAME             CLUSTER-IP    EXTERNAL-IP   PORT(S)     AGE
k8sm-scheduler   10.10.10.50   <none>        10251/TCP   1m
kubernetes       10.10.10.1    <none>        443/TCP     2m

Then I created a simple nginx pod, but kubectl get pods always shows it as Pending. kubectl get events shows:

FIRSTSEEN   LASTSEEN   COUNT     NAME      KIND      SUBOBJECT   TYPE      REASON             SOURCE                 MESSAGE
2m          47s        9         nginx     Pod                   Warning   FailedScheduling   {default-scheduler }   Error scheduling: No suitable offers for pod/task

What does "No suitable offers for pod/task" mean? In the Mesos log, I see that Mesos keeps sending offers to the Kubernetes framework, but they keep being DECLINED. If I run mesos-execute --master=10.0.2.15:5050 --name=echo --command="echo 'hello world'" --containerizer=docker --docker_image=ubuntu:14.04, it deploys a Docker container with the "mesos-" prefix and runs the command, so the Docker containerizer seems to work properly.


Solution

  • Kubernetes-Mesos will decline offers for several reasons:

    1. The resources in the offer don't satisfy the minimum required to launch the pod-task. The first pod-task launched on a given slave requires executor resources in addition to the pod-task resources.
    2. The resources in the offer aren't compatible with the scheduler. This happens if you start the framework, launch a task, kill the scheduler process, and then restart the scheduler with different flags; some scheduler flags affect the command line used to launch the executor. The quickest remedy is to delete any running pods and manually kill the incompatible executor process(es) already running on the slave(s).
    3. There is a problem with the node info in the apiserver registry.
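To make reason 1 concrete, here is a rough sketch of the arithmetic: an offer only fits if it covers the pod-task resources plus, for the first pod-task on a slave, the executor resources. The helper name offer_fits and every number below are hypothetical and chosen for illustration; they are not actual Kubernetes-Mesos defaults or code.

```shell
#!/bin/sh
# Hypothetical helper, not part of Kubernetes-Mesos.
# Usage: offer_fits OFFER_CPU OFFER_MEM POD_CPU POD_MEM [EXEC_CPU EXEC_MEM]
# CPU values are integers in millicores, memory values are integers in MB.
# Leave out the executor values when an executor already runs on the slave.
offer_fits() {
  offer_cpu=$1; offer_mem=$2
  pod_cpu=$3;   pod_mem=$4
  exec_cpu=${5:-0}; exec_mem=${6:-0}

  # Total needed = pod-task resources + executor resources (if any).
  need_cpu=$((pod_cpu + exec_cpu))
  need_mem=$((pod_mem + exec_mem))

  # Succeeds (exit status 0) only when both CPU and memory are covered.
  [ "$offer_cpu" -ge "$need_cpu" ] && [ "$offer_mem" -ge "$need_mem" ]
}

# Illustrative numbers only: first pod-task on a slave, so executor
# resources are included in the check.
if offer_fits 1000 512 250 64 250 128; then
  echo "offer is large enough"
else
  echo "offer would be declined: not enough resources"
fi
```

On a small single-slave VM, comparing the resources the slave advertises against the sum above is a quick way to rule this reason in or out.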

    Which version of k8sm are you running? Is it master? Try increasing the verbosity of the scheduler logs (--v=3) and then posting a copy of the scheduler logs on pastebin or similar so that they can be analyzed. It's often difficult to troubleshoot situations like this without the logs.
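A minimal sketch of collecting those logs, assuming the same environment variables and flags as in the question, with only --v raised to 3. The function names are made up for this example, and the search term "offer" is just a guess at what is relevant, not a known log format.

```shell
#!/bin/sh
# Start the scheduler with the flags from the question, at verbosity 3.
# Stop the previously started scheduler process before calling this.
start_verbose_scheduler() {
  km scheduler \
    --address="${KUBERNETES_MASTER_IP}" \
    --mesos-master="${MESOS_MASTER}" \
    --etcd-servers="http://${KUBERNETES_MASTER_IP}:4001" \
    --mesos-user=root \
    --api-servers="${KUBERNETES_MASTER_IP}:8888" \
    --cluster-dns=10.10.10.10 \
    --cluster-domain=cluster.local \
    --v=3 >scheduler.log 2>&1 &
}

# Print every log line that mentions offers (case-insensitive), prefixed
# with its line number. Defaults to scheduler.log in the current directory.
offer_lines() {
  grep -in 'offer' "${1:-scheduler.log}"
}
```

After calling start_verbose_scheduler and recreating the pod, running offer_lines narrows the log to the lines worth sharing alongside the full file.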