google-cloud-platform, autoscaling, google-cloud-composer

Scaling GCP Cloud Composer environment size


I am using Cloud Composer 2.

I understand the purpose of scaling the number of workers, their compute, and their memory. I also understand why I would want additional schedulers (mostly for resiliency, I'd say). But I don't understand why I would need to bump up the web server CPU/Memory/Storage or the scheduler CPU/Memory/Storage. The documentation explains the settings, but doesn't give any real-world examples of when they would matter.

The scheduler simply needs enough power to launch jobs, which is trivial, and the web server is only used internally within our company, so if it is slow or laggy, that's OK.

Also, what about the ENVIRONMENT_SIZE? I see this is for the Airflow DB, but again, why would this DB ever need to be of any significant size? It holds so little.


Does anyone have a use case where they needed a more powerful scheduler or web server, or a DB/environment size other than SMALL? I'm curious what I should be on the lookout for and what the argument would be for bumping these up.


Solution

  • As mentioned in the documentation1 and documentation2, each environment size provides different resources and serves different uses. The environment size controls the performance parameters of the managed Cloud Composer infrastructure, which includes the Airflow database. Consider selecting a larger environment size if you want to run a large number of DAGs and tasks.

    If you define a small environment size, the environment is created with 0.5 vCPU, 2 GB memory, and 1 GB storage; a medium environment size gives 2 vCPU, 7.5 GB memory, and 5 GB storage; and a large environment size gives 4 vCPU, 15 GB memory, and 10 GB storage.

    The worker presets for each environment size are as follows:

    Small: autoscaling between 1 and 3 workers, each with 0.5 vCPU, 2 GB memory, 1 GB storage

    Medium: autoscaling between 2 and 6 workers, each with 2 vCPU, 7.5 GB memory, 5 GB storage

    Large: autoscaling between 3 and 12 workers, each with 4 vCPU, 15 GB memory, 10 GB storage
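
    As a rough sketch of how a preset is chosen (the environment name, location, and image version below are placeholders, and the flag should be checked against the gcloud reference for your version), the environment size is set at creation time with the --environment-size flag:

        gcloud composer environments create example-environment \
            --location us-central1 \
            --image-version composer-2.x.x-airflow-2.x.x \
            --environment-size medium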

    There is also a custom setting where you can configure each of these options individually to match your requirements. This documentation explains how the autoscaling works: you set the minimum and maximum number of workers, and the environment autoscales the worker count within those limits. Note that the minimum number of workers cannot be set to zero (0); it must be 1 or more. A sketch of such an update is shown below.
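
    For example, a minimal sketch of a custom worker configuration on an existing environment (the name, location, and values are illustrative; memory and storage values are in GB, and the exact flags should be checked against the gcloud reference):

        gcloud composer environments update example-environment \
            --location us-central1 \
            --min-workers 1 \
            --max-workers 6 \
            --worker-cpu 2 \
            --worker-memory 7.5 \
            --worker-storage 5

    The scheduler and web server have analogous --scheduler-* and --web-server-* flags (CPU, memory, storage) if you need to size those individually.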

    If you would like the ability to scale the workers down to zero (0) when they are idle, you can raise a Feature Request in the Public Issue Tracker (PIT).
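
    To check what an existing environment is currently provisioned with, you can describe it and look at its environment size and workloads configuration (a sketch; the field names assume the Composer 2 API's environmentSize and workloadsConfig fields):

        gcloud composer environments describe example-environment \
            --location us-central1 \
            --format="yaml(config.environmentSize,config.workloadsConfig)"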