amazon-web-services, amazon-ec2, amazon-ecs

Confusion about instances used inside Amazon EC2 Container Service (ECS)


When an EC2 Container Service (ECS) cluster is created, it creates an Auto Scaling group to manage the created instances. These instances come from the EC2 service, which means they are virtual machines.

But we know that containers represent a new way to deploy applications based on operating-system-level virtualization rather than hardware virtualization like VMs, which are heavyweight and non-portable. Isn't that a contradiction? Correct me if I'm wrong.

We use containers because they are extremely fast (both in boot time and in task execution) compared to VMs, and they save a lot of storage space. So if we have one node (VM) that can support 4 containers max, our clients can rapidly launch 4 containers, but beyond this number, the EC2 autoscaler will need to launch a new node (VM) to support upcoming containers, which incurs some task delay.

Isn't it possible to launch containers on physical machines?

And what do you recommend for running time-critical tasks?


Solution

  • I believe you are working under an erroneous assumption that ECS scales the virtual machines ("container instances" -- the instances where containers will run) directly with task demand.

    If that were true, you would have a point, because the cluster would be sluggish and unresponsive any time sufficient container instance resources were not immediately available.

    ECS doesn't do that, the presence of the Auto Scaling Group notwithstanding.

    Depending on the Amazon EC2 instance types you use in your clusters, and quantity of container instances you have in a cluster, your tasks have a limited amount of resources that they can use when they are run. ECS monitors the resources available in the cluster to work with the schedulers to place tasks. If your cluster runs low on any of these resources, such as memory, you will eventually be unable to launch more tasks until you add more container instances, reduce the number of desired tasks in a service, or stop some of the running tasks in your cluster to free up the constrained resource. (emphasis added)

    http://docs.aws.amazon.com/AmazonECS/latest/developerguide/cloudwatch_alarm_autoscaling.html
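
    To make that resource bookkeeping concrete, here is a minimal sketch using boto3 (the cluster name `my-cluster` is just a placeholder) that lists each container instance's registered vs. remaining CPU and memory -- the numbers ECS consults when deciding whether a task can be placed:

    ```python
    import boto3

    ecs = boto3.client("ecs")

    # Placeholder cluster name -- substitute your own.
    CLUSTER = "my-cluster"

    # List the container instances registered to the cluster.
    instance_arns = ecs.list_container_instances(cluster=CLUSTER)["containerInstanceArns"]

    if instance_arns:
        detail = ecs.describe_container_instances(
            cluster=CLUSTER, containerInstances=instance_arns
        )
        for ci in detail["containerInstances"]:
            # registeredResources: what the instance brought to the cluster.
            # remainingResources: what is still free for new tasks to reserve.
            registered = {r["name"]: r.get("integerValue") for r in ci["registeredResources"]}
            remaining = {r["name"]: r.get("integerValue") for r in ci["remainingResources"]}
            print(ci["ec2InstanceId"],
                  "CPU", remaining.get("CPU"), "/", registered.get("CPU"),
                  "MEM", remaining.get("MEMORY"), "/", registered.get("MEMORY"))
    ```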

    So, no... it doesn't launch the new tasks slowly when you are out of capacity. It doesn't launch them at all.

    But don't get ahead of me.

    The link above explains, with examples, how scaling of the virtual machines (container instances) is designed to actually work.

    Of course, you don't have to make them adaptively scalable at all. You can go with your physical server model (note: I say physical server model -- meaning a fixed, inelastic pool of resources, on always-running virtual machines, since virtual machines are what EC2 provides), and just choose how many instances you want to have running at all times, essentially emulating physical servers. If you wanted, say, 8 container instances, the "auto scaling group" would maintain exactly 8 at all times, creating replacements if, say, one of them experienced a hardware failure. That "auto" accomplishment would be maintaining the status quo. And, of course, in this configuration, you could manually reconfigure from 8 to, say, 12, and the "auto" accomplishment would be that you'd automatically get 4 new ones to add to the existing 8.
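
    A rough sketch of that fixed-pool configuration with boto3, assuming an Auto Scaling group named `ecs-cluster-asg` (a placeholder): pinning min, max, and desired capacity to the same number emulates a static fleet, and the manual "reconfigure from 8 to 12" is just another update.

    ```python
    import boto3

    autoscaling = boto3.client("autoscaling")

    # Placeholder group name -- substitute the Auto Scaling group backing your cluster.
    ASG_NAME = "ecs-cluster-asg"

    # Pin the group at exactly 8 instances: failed instances get replaced
    # automatically, but the fleet never grows or shrinks on its own.
    autoscaling.update_auto_scaling_group(
        AutoScalingGroupName=ASG_NAME,
        MinSize=8,
        MaxSize=8,
        DesiredCapacity=8,
    )

    # Later, manually reconfiguring from 8 to 12 is just another update.
    autoscaling.update_auto_scaling_group(
        AutoScalingGroupName=ASG_NAME,
        MinSize=12,
        MaxSize=12,
        DesiredCapacity=12,
    )
    ```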

    But the idea of how the service is ideally used is that your group of virtual machines scales up and down by rules you devise, to anticipate the resources needed by future tasks -- or a future lack of tasks.

    In the example given, memory reservation is the trigger:

    When the memory reservation of your cluster rises above 75% (meaning that only 25% of the memory in your cluster is available for new tasks to reserve), the alarm triggers the Auto Scaling group to add another instance and provide more resources for your tasks and services.

    It triggers the addition of more container instances so that you always have whatever you have determined to be the appropriate threshold of surplus capacity already online by the time you need it.
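
    A hedged sketch of that wiring, again with boto3 and placeholder names (`my-cluster`, `ecs-cluster-asg`, and the 75% threshold from the example): a simple scaling policy on the Auto Scaling group, plus a CloudWatch alarm on the cluster's MemoryReservation metric that invokes it.

    ```python
    import boto3

    autoscaling = boto3.client("autoscaling")
    cloudwatch = boto3.client("cloudwatch")

    CLUSTER = "my-cluster"          # placeholder ECS cluster name
    ASG_NAME = "ecs-cluster-asg"    # placeholder Auto Scaling group name

    # Scaling policy: add one container instance when invoked.
    policy = autoscaling.put_scaling_policy(
        AutoScalingGroupName=ASG_NAME,
        PolicyName="scale-out-on-memory-reservation",
        AdjustmentType="ChangeInCapacity",
        ScalingAdjustment=1,
        Cooldown=300,
    )

    # Alarm: cluster-wide memory reservation above 75% triggers the policy.
    cloudwatch.put_metric_alarm(
        AlarmName=f"{CLUSTER}-memory-reservation-high",
        Namespace="AWS/ECS",
        MetricName="MemoryReservation",
        Dimensions=[{"Name": "ClusterName", "Value": CLUSTER}],
        Statistic="Average",
        Period=60,
        EvaluationPeriods=1,
        Threshold=75.0,
        ComparisonOperator="GreaterThanThreshold",
        AlarmActions=[policy["PolicyARN"]],
    )
    ```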

    Of course, memory is just one resource, and 75% is just an arbitrary threshold chosen for the example.

    Auto Scaling Groups can scale on a variety of triggers -- the phase of the moon, the price trends in the stock market, whatever can be quantified and monitored and is appropriate to anticipating your desired amount of surplus capacity can be used... but this service does not scale itself directly in response to an attempt to launch a task that fails due to insufficient resources.

    Herein lies the flaw in your original argument.

    Why virtual machines? Simply enough, because when you destroy a virtual machine because the capacity is not expected to be needed, you stop paying for it.

    In this light, perhaps you'll agree that this is not a weakness, it's a strength. Physical servers never stop costing you when you are not using them.

    With VMs, you don't need to pay anything at all for capacity you will not be needing -- you only pay for the capacity you're using, plus the amount you need to keep immediately available to handle anticipated demand.

    You can have as much idle surplus immediately ready as you are willing to pay for, or you can maximize savings by allowing as little surplus capacity as you are comfortable with being able to access without delay.