apache-spark, mesos, dcos

DC/OS stuck at service deployment


I have built a test DC/OS cluster with 3 masters and 2 public agents. Everything looks good. I even deployed an application through Marathon with the "acceptedResourceRoles":["slave_public"] configuration. However, when I try to deploy the Spark service through the Catalog section, it gets stuck for a long time and says DCOS has been waiting for resources for xx minutes.

My question is: can the services in the repo only be deployed on private agents?

If that is not the case, how can I deploy them on the public agents?

Many thanks.


Solution

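  • The catalog deploys services on private agents. Public agents are intended only to run load balancers and other tools that control access to services on private agents :) Your cluster has no private agents, which is most likely why the Spark deployment keeps waiting for resources, so you might want to add some private ones.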

    For more info on agent types see these docs

    If you do want your cluster to run workloads on public agents, I think there is some extra configuration you need to set, but I can't find it at the moment. It's a bit of an antipattern, though.
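
For reference, the Marathon deployment that the question says already works on public agents can be sketched roughly as below. This is only an illustration of the "acceptedResourceRoles":["slave_public"] setting quoted in the question, not the missing catalog configuration mentioned above. The helper name marathon_app, the app id, the command, and the resource numbers are all made up for the example.

```python
import json


def marathon_app(app_id, cmd, public=False):
    """Build a minimal Marathon app definition as a dict.

    All values here (cpus, mem, instances) are placeholder assumptions.
    """
    app = {
        "id": app_id,
        "cmd": cmd,
        "cpus": 0.1,
        "mem": 32,
        "instances": 1,
    }
    if public:
        # The setting quoted in the question: only accept resource offers
        # made under the public-agent role.
        app["acceptedResourceRoles"] = ["slave_public"]
    return app


if __name__ == "__main__":
    # Hypothetical app id and command, for illustration only.
    definition = marathon_app("/test-public", "sleep 3600", public=True)
    print(json.dumps(definition, indent=2))
```

The printed JSON is the kind of app definition you would hand to Marathon. Leaving out acceptedResourceRoles gives a definition without the public-agent restriction, which is the situation catalog packages such as Spark appear to be in by default.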