I have an App Engine service (standard environment) sitting behind a load balancer. The load balancer is accessed via a static IP address reserved by Google, and the ingress settings of the App Engine service have been changed to "Internal + Load Balancing". Therefore, the app can only be accessed via the reserved IP address and not via its URL (https://PROJECT_ID.REGION_ID.r.appspot.com).
On top of this, I have strict Cloud Armor policies (allowing only traffic from one country) targeting the backends of the load balancer as well as selective Firewall rules (allowing only certain IP addresses in the country) at the App Engine level. Future plans include maybe removing the Firewall rules to open traffic to the entire country only.
My goal is to have at least one running instance of the app service during office hours, e.g. between 8 AM and 5 PM, so that incoming requests are processed immediately. Outside these hours, I accept that users will have to wait for instances to start up. I would like to achieve this to minimize the accrual of instance hours and therefore costs. I know there is a free tier, but I am expecting to scale up the number of services that need at least one running instance.
I have already tried the following solutions to achieve this goal. However, each solution has a catch that I cannot accept for operational purposes described above:
1. min_instances: 1 in the app.yaml configuration file. This does not minimize costs.
2. cron.yaml to hit a certain relative URL every 15 minutes. This hits the URL of the App Engine service (https://PROJECT_ID.REGION_ID.r.appspot.com), which is closed due to ingress settings. I cannot open the service to all traffic as this would defeat the purpose of having Cloud Armor.
What I have not tried:
What I have tried but am not entirely convinced by:
I find solution 6. to be the most straightforward. However, can I trust Google to not inadvertently start up my instances? What if they change their IP addresses? Am I overlooking this problem and is there a simpler way of solving it?
To start with your Question part:
can I trust Google to not inadvertently start up my instances?
Google Cloud services, including Cloud Scheduler and App Engine, are designed to act according to their intended functionality and your configuration.
In this case, since the target is an App Engine standard application, you only have to make sure your scheduler calls the service before it hits its idle_timeout, so that it does not get scaled down to 0. So every 5 minutes, for example. (idle_timeout varies based on scaling type, details here)
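To make the timing concrete, here is a minimal Python sketch of the gate a scheduled keep-alive job could apply before pinging the service. The 08:00 to 17:00 window comes from the question; the weekday restriction and the name in_office_hours are assumptions made for illustration only.

```python
from datetime import datetime, time

OFFICE_START = time(8, 0)   # from the question: 8 AM
OFFICE_END = time(17, 0)    # from the question: 5 PM


def in_office_hours(now: datetime) -> bool:
    """Return True if `now` (local time) falls inside the office-hours window."""
    # Assumption: no keep-alive pings on weekends (Monday=0 ... Friday=4).
    is_weekday = now.weekday() < 5
    return is_weekday and OFFICE_START <= now.time() < OFFICE_END
```

A job that fires every 5 minutes would only send its keep-alive request when this returns True, so the service is free to scale down to 0 the rest of the time.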
What if they change their IP addresses?
Yes, they can change. But if you look at how many times they have changed in the last couple of years: none.
As per the official documentation, it is the app owner's responsibility to keep an eye on this. However, they will try to notify you, for example by email, if such a change happens. I wouldn't worry much if it's an active, maintained project.
Am I overlooking this problem and is there a simpler way of solving it?
Let's see.
Achieving scheduled scaling in the Google App Engine Standard Environment with your specific requirements can be a bit tricky.
On your potential solutions list:
1. Indeed, setting min_instances: 1 in app.yaml is not the best choice if you're looking for a cost-effective solution.
2. On cron.yaml / App Engine cron jobs: I understand that the URL of the App Engine service (https://PROJECT_ID.REGION_ID.r.appspot.com) is closed. I am not sure if you know that App Engine issues cron requests from a fixed IP address, 0.1.0.2, and also adds a custom header, X-Appengine-Cron: true (see here). If you can make this work by allowing this IP in the firewall, then this is the most effective solution for your use case. Maybe this helps for the firewall aspect. It gives you the flexibility of automatic_scaling (so after 17:00 your app can shut down), and App Engine cron jobs are included in the App Engine free tier.
3. This one is overkill, indeed for the reasons you already mentioned.
4. It looks to me like you could achieve this with the Google API Client, but you might run into different issues, overhead and, mainly, cost: starting instances programmatically may lead to higher costs if instances are frequently started and stopped outside App Engine's own scaling, as this consumes resources and potentially incurs additional billing for Google API Client usage.
5. I like Cloud Functions (serverless compute) and also the proposal you made. It will add a function to your project (at very little cost), but compared to all the other options this appears to be the second best. I still think option 2 is the best, using cost and ease of achieving the goal as metrics.
6. You have my answers above. :)
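As a rough illustration of the cron option, here is a minimal sketch of a keep-warm endpoint that only answers requests carrying the X-Appengine-Cron header. It assumes a Python runtime and a plain WSGI app; the path /tasks/keep-warm, the timezone and the cron.yaml shown in the comments are hypothetical, and whether cron traffic actually gets through your ingress and firewall settings still has to be verified in your project.

```python
# main.py: a stdlib-only WSGI app (illustrative sketch, not a drop-in solution).
#
# Assumed companion cron.yaml (path, schedule and timezone are hypothetical):
#   cron:
#   - description: keep one instance warm during office hours
#     url: /tasks/keep-warm
#     schedule: every 5 minutes from 08:00 to 17:00
#     timezone: Europe/Amsterdam

KEEP_WARM_PATH = "/tasks/keep-warm"  # hypothetical; must match the url in cron.yaml


def is_cron_request(environ):
    """Return True if the request carries the App Engine cron header."""
    # WSGI exposes the X-Appengine-Cron header as HTTP_X_APPENGINE_CRON.
    return environ.get("HTTP_X_APPENGINE_CRON") == "true"


def app(environ, start_response):
    """Answer cron pings cheaply and reject keep-warm calls from anyone else."""
    if environ.get("PATH_INFO") == KEEP_WARM_PATH:
        if is_cron_request(environ):
            start_response("200 OK", [("Content-Type", "text/plain")])
            return [b"warm"]
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"cron only"]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]
```

The "from 08:00 to 17:00" part of the schedule is what limits the pings to office hours; outside that window nothing calls the endpoint, so automatic scaling can take the service down to 0.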
I think there are a few more approaches you could consider:
Manual Scaling with dispatch.yaml: You can combine manual scaling with the dispatch.yaml configuration file. You could deploy a separate service with manual scaling and route requests to it with dispatch.yaml, then start and stop that service's version around your office hours. However, given all the ingress rules with Cloud Armor, adding another service to the landscape could be a difficult exercise.
Task Queues and Pull Queues: You can use Task Queues or Pull Queues to decouple incoming requests from the instance startup process. This way, when an incoming request arrives, instead of directly processing it, you could enqueue it as a task. The task queue could be configured to start instances as needed during your office hours. Once the instance starts, it can start processing the enqueued tasks.
Move to App Engine Flex Environment: In the App Engine Flexible Environment, you have more control over the runtime environment, and you can use Docker containers. This might give you more flexibility in managing instance startup times and scheduling.
Lastly, tuning your app: This option goes somewhat against what the OP asks for. The idea is to tune your app so that its startup time is much faster. It takes time and a number of iterations, but there are configuration settings, like the ones below, that can speed up startup, together with enabled warmup requests. This answer on another post has some good details.
<automatic-scaling>
<target-cpu-utilization>0.75</target-cpu-utilization>
<max-pending-latency>4s</max-pending-latency>
</automatic-scaling>
The App Engine Standard Environment is designed to scale automatically based on incoming requests, so your specific requirement of having at least one instance running during office hours runs somewhat counter to this design.
Therefore, you might need to make trade-offs between this requirement and the automatic scaling benefits of the standard environment. The standard environment can scale from zero instances up to thousands very quickly.
It's also worth considering the costs involved in maintaining at least one running instance during office hours versus the potential cost savings from not accruing instance hours outside of these hours.
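As a back-of-the-envelope comparison, the instance hours for one always-on instance versus one instance kept warm only during office hours can be estimated as below. This is a sketch under simplifying assumptions: a 30-day month, keep-warm on every day, and no hours counted for instances started by out-of-hours requests or for the idle period before shutdown. No prices are included; multiply by the rate for your instance class.

```python
OFFICE_HOURS_PER_DAY = 17 - 8  # 08:00 to 17:00, from the question
DAYS_PER_MONTH = 30            # assumption: a 30-day month, every day counted


def monthly_instance_hours(hours_per_day, days=DAYS_PER_MONTH):
    """Instance hours accrued by one instance running `hours_per_day` each day."""
    return hours_per_day * days


always_on = monthly_instance_hours(24)                      # 720
office_only = monthly_instance_hours(OFFICE_HOURS_PER_DAY)  # 270
saved = always_on - office_only                             # 450

print(always_on, office_only, saved)
```

Under these assumptions, the office-hours approach accrues a bit more than a third of the instance hours of min_instances: 1 per service, which matters more as the number of services grows.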