
DC/OS: Service vs Marathon App


I have the following two questions:

1) Are DC/OS Services just marathon apps? (Or: What's the difference between a DC/OS Service (like Cassandra) and a Cassandra app installed via Marathon?)

2) Scaling: Do DC/OS services like Cassandra automatically scale to all available nodes in the cluster (given sufficient workload)?

Thank you for your help :)


Solution

  • 1) To answer the first part of your question, let me add one more concept, the DC/OS package, so we are comparing DC/OS package vs DC/OS service vs Marathon app.

    a) DC/OS service vs Marathon app: They are the same thing, a long-running service that Marathon automatically keeps running. You can see this, for example, when creating a new DC/OS service, which you can do with a Marathon app definition.

    b) DC/OS packages (and, I believe, the core of your question): dcos package install cassandra will deploy the DC/OS Apache Cassandra package. The interesting piece of the Cassandra package is the scheduler, a piece of software that manages your Cassandra cluster (e.g., by bootstrapping the cluster or automatically restarting failed tasks) and also provides endpoints for scaling, upgrading, and so on. If you like, it is the automated version of an administrator for your Cassandra cluster.

    Now we also have to ensure that this administrator is always available (i.e., what happens if the scheduler task, or the node it runs on, fails?). This is why the scheduler is deployed by Marathon, so it is automatically restarted.

    Marathon -> Cassandra Scheduler -> Cassandra Cluster
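To make point a) a bit more concrete: a Marathon app definition is a small JSON document. The following is only a minimal sketch in Python that builds one. The helper name build_app_definition, the app id and the shell command are made up for illustration; the field names (id, cmd, cpus, mem, instances) are the usual Marathon app fields.

```python
import json


def build_app_definition(app_id, cmd, cpus=0.1, mem=32, instances=1):
    """Return a minimal Marathon app definition as a plain dict.

    Hypothetical helper for illustration only; the defaults are arbitrary.
    """
    # Marathon app ids are paths, so make sure the id starts with a slash.
    if not app_id.startswith("/"):
        app_id = "/" + app_id
    return {
        "id": app_id,
        "cmd": cmd,              # command Marathon keeps running
        "cpus": cpus,            # CPU shares per instance
        "mem": mem,              # memory per instance, in MB
        "instances": instances,  # how many copies Marathon should keep alive
    }


# Example values are assumptions, not taken from any real deployment.
app = build_app_definition(
    "hello-service",
    "while true; do echo hello; sleep 10; done",
)
print(json.dumps(app, indent=2))
```

If you save the printed JSON as app.json, something like dcos marathon app add app.json should deploy it, and it then shows up as a service in DC/OS.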

    2) Second part of your question: Autoscaling

    The package provides endpoints for scaling, so the typical pattern is to provide a script (e.g., marathon-autoscale) that scales the Cassandra cluster. The reason you need your own script is that scaling is very specific to each user, especially scaling down. Keep in mind that you are scaling a persistent service, so how do you select the node you want to remove? Do you first drain traffic from that node? Do you migrate data to other nodes?
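As a rough illustration of why such a script is user-specific, here is a sketch of the decision logic it might contain. Everything in it is an assumption: the thresholds, the "remove the node holding the least data" policy, and the step names are hypothetical, and it deliberately does not call any real scheduler endpoint.

```python
def decide_node_count(current_nodes, avg_load,
                      scale_up_at=0.8, scale_down_at=0.3,
                      min_nodes=3, max_nodes=10):
    """Return the desired node count for a given average load (0.0 to 1.0).

    All thresholds and limits are made-up defaults for illustration.
    """
    if avg_load > scale_up_at and current_nodes < max_nodes:
        return current_nodes + 1
    if avg_load < scale_down_at and current_nodes > min_nodes:
        return current_nodes - 1
    return current_nodes


def pick_node_to_remove(data_per_node):
    """Pick the node holding the least data (one possible policy of many)."""
    return min(data_per_node, key=data_per_node.get)


def plan_scale_down(data_per_node):
    """Return the ordered steps for removing one node from a persistent service.

    The step names are placeholders; a real script would map each one to
    whatever your own tooling does for that step.
    """
    node = pick_node_to_remove(data_per_node)
    return [
        ("drain_traffic", node),
        ("migrate_data", node),
        ("remove_node", node),
    ]


# Hypothetical per-node data sizes in GB.
plan = plan_scale_down({"node-a": 50, "node-b": 10, "node-c": 30})
print(plan)
```

Scaling up is the easy branch here. The scale-down plan is where your own answers to the questions above (which node, drain first, migrate data) have to go.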