google-cloud-platform deployment scale message-queue video-processing

How to scale a heavy video rendering server?

I'm working on a video rendering server in Node.js. It spawns multiple headless Chrome instances, uses the Puppeteer library to capture screenshots, and feeds them to ffmpeg. It then concatenates all the parts and applies some post-processing.

Now I want to move it to production, but it's not performing efficiently. I tried a serverless architecture, Cloud Run, and so on, but still couldn't get acceptable performance. Their documentation also clearly states that these services are not meant for heavy, long-running tasks. Rendering a video takes too long, even longer than it takes on my laptop.

I tried GCE and the results are satisfactory, but now I'm having a hard time scaling it. The server can only handle one request at a time efficiently. How do I scale horizontally and make sure that each instance gets only one request at a time?

Thanks in advance.

Solution

To scale the number of identical instances, you can use Managed Instance Groups. Have a look at the autoscaling documentation for a better understanding of how it works, but basically it says:

You can autoscale based on one or more of the following metrics that reflect the load of the instance group:

Average CPU utilization.

HTTP load balancing serving capacity, which can be based on either utilization or requests per second.

Cloud Monitoring metrics.

If you will be autoscaling based on CPU usage, just enable autoscaling and configure it when creating a new group of instances.

Here's an example gcloud command to do this:

gcloud compute instance-groups managed set-autoscaling example-managed-instance-group \
    --max-num-replicas 20 \
    --target-cpu-utilization 0.60 \
    --cool-down-period 90

You can also use any available metric to scale your group, or even create a new custom metric that will trigger scaling:

You can create custom metrics using Cloud Monitoring and write your own monitoring data to the Monitoring service. This gives you side-by-side access to standard Google Cloud data and your custom monitoring data, with a familiar data structure and consistent query syntax. If you have a custom metric, you can choose to scale based on the data from these metrics.
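As an illustration of how such a metric could translate into a group size, here is a minimal sketch of the arithmetic, assuming a hypothetical custom metric that counts pending render jobs and a server that handles one job per VM. The function name, the option names, and the limits are assumptions made for this example; the real autoscaler applies its own policy and stabilization on top of whatever target you configure.

```javascript
// Hypothetical sizing rule: spread pending jobs over instances,
// then clamp the result to the group's minimum and maximum size.
function desiredInstances(pendingJobs, { jobsPerInstance = 1, min = 1, max = 20 } = {}) {
  const needed = Math.ceil(pendingJobs / jobsPerInstance);
  return Math.min(max, Math.max(min, needed));
}

console.log(desiredInstances(0));   // 1  (never below the minimum)
console.log(desiredInstances(7));   // 7  (one VM per pending job)
console.log(desiredInstances(50));  // 20 (capped at the maximum)
```

With jobsPerInstance set to 1, the group grows by one VM per queued video until it reaches the maximum, which matches the one-request-per-server constraint in the question.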

And lastly, I've found this example use case that scales a group of VMs based on a Pub/Sub queue, which might be the solution you're looking for.
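With a queue in front of the group, each VM can pull work instead of receiving requests, which makes the one-job-at-a-time constraint easy to enforce. The sketch below shows that pattern in Node.js. pullJob and renderVideo are hypothetical stand-ins for the real queue client and the Puppeteer/ffmpeg pipeline, and the in-memory array only simulates a queue so the example can run on its own.

```javascript
// Processes jobs strictly one at a time: the next job is pulled only
// after the previous render has finished (or failed).
// pullJob and renderVideo are hypothetical functions supplied by the caller.
async function runWorker(pullJob, renderVideo) {
  let processed = 0;
  for (;;) {
    const job = await pullJob(); // resolves to null when the queue is empty
    if (job === null) break;
    try {
      await renderVideo(job);
      processed += 1;
    } catch (err) {
      // A failed render is logged and skipped so the worker keeps going.
      console.error(`render failed for job ${job.id}:`, err.message);
    }
  }
  return processed;
}

// Usage with an in-memory queue standing in for the real one.
const queue = [{ id: 'a' }, { id: 'b' }, { id: 'c' }];
let active = 0;
let maxActive = 0;

const pullJob = async () => (queue.length > 0 ? queue.shift() : null);
const renderVideo = async (job) => {
  active += 1;
  maxActive = Math.max(maxActive, active);
  await new Promise((resolve) => setTimeout(resolve, 10)); // pretend to render
  active -= 1;
};

const done = runWorker(pullJob, renderVideo).then((count) => {
  console.log(`processed ${count} jobs, max concurrent renders: ${maxActive}`);
  return count;
});
```

A production worker would wait for new messages rather than exit on an empty queue, and would acknowledge a message only after its render completes so that a job on a crashed VM can be redelivered. If you use the Pub/Sub Node.js client, limiting the subscriber's flow control to a single outstanding message should give a similar one-at-a-time effect, but check the client documentation for the exact option.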