I have been using Google Cloud for a few weeks now, and I am facing a big problem given my limited GCP knowledge.
I have a Python project whose goal is to "scrape" data from a website using its API. The project makes a few tens of thousands of requests per execution, and a run can take very long (a few hours, maybe more).
I have 4 python scripts in my project and it's all orchestrated by a bash script.
The execution is as follows:
Now I want to get rid of that bash script, and I would like to automate the execution of those scripts approximately once a week.
The problem here is the execution time. Here is what I have already tested:
Google App Engine: the timeout of a request on GAE is limited to 10 minutes, and my functions can run for a few hours, so GAE is not usable here.
Google Compute Engine: my scripts will run for at most 10-15 hours a week, so keeping a Compute Engine instance up all the time would be too pricey.
What could I do to automate the execution of my scripts in a cloud environment? What solutions have I not thought about that would not require changing my code?
Thank you
A simple way to accomplish this without the need to get rid of the existing bash script that orchestrates everything would be:
With that, your instance will start on a schedule, run the startup script (which will be your existing orchestrating script), and shut down once it has finished.
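As a rough illustration of the "run, then shut down" part, here is a minimal Python sketch of a wrapper that the startup script could call. The path `/opt/scraper/run_all.sh` and the function name `run_then_shutdown` are hypothetical, made up for this example. It also assumes a Linux VM where the startup script runs as root, so that the standard `shutdown -h now` command powers the instance off.

```python
import subprocess
import sys

# Hypothetical path to the existing bash orchestrator; adjust it to your VM.
ORCHESTRATOR = ["/bin/bash", "/opt/scraper/run_all.sh"]
# Standard Linux power-off command (assumes the caller runs as root).
SHUTDOWN = ["shutdown", "-h", "now"]


def run_then_shutdown(job=ORCHESTRATOR, shutdown=SHUTDOWN, run=subprocess.run):
    """Run the job, then always issue the shutdown command.

    Returns the job's exit code, or 1 if the job could not be started.
    """
    try:
        # check=False: a failing job must not prevent the shutdown below.
        code = run(job, check=False).returncode
    except OSError as exc:
        print(f"could not start job: {exc}", file=sys.stderr)
        code = 1
    finally:
        # Runs whether the job succeeded, failed, or never started,
        # so the instance does not keep billing after a crash.
        run(shutdown, check=False)
    return code
```

Calling `run_then_shutdown()` as the last step of the startup script would run the unchanged bash orchestrator and then power the machine off. The `run` parameter exists only so the function can be exercised with a fake runner instead of really shutting a machine down.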