I am using a BI tool to view data in BigQuery. The data is spread across multiple tables, so my query uses several JOINs to get the data into the format I need. Because the query takes about a minute to process and I want a near-real-time experience, I created a scheduled query that performs the joins and saves the output to a table. The source tables are fed by an ETL tool every 30 minutes (incremental load), which creates a BigQuery load job each time.
For one particular table, I want to run the scheduled query as soon as the load job for that table is done.
I have disabled the schedule on the scheduled query so it can only be run on demand via an API call, and I have written a Python script that sends that API request.
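For reference, a minimal sketch of what that script can look like, using the BigQuery Data Transfer Service client to start a manual run of the scheduled query. The project, location, and transfer config IDs below are placeholders:

```python
import time

from google.cloud import bigquery_datatransfer
from google.protobuf.timestamp_pb2 import Timestamp

# Placeholder: full resource name of the scheduled query (transfer config).
TRANSFER_CONFIG = "projects/my-project/locations/us/transferConfigs/my-config-id"


def trigger_scheduled_query():
    """Start a manual run of the scheduled query right now."""
    client = bigquery_datatransfer.DataTransferServiceClient()
    run_time = Timestamp(seconds=int(time.time()))
    response = client.start_manual_transfer_runs(
        request={
            "parent": TRANSFER_CONFIG,
            "requested_run_time": run_time,
        }
    )
    for run in response.runs:
        print(f"Started transfer run: {run.name}")


if __name__ == "__main__":
    trigger_scheduled_query()
```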
Is there any way in Python to monitor the logs for a particular BigQuery table in real time, so that when a job's status changes to 'Succeeded' for that table, I can send the API request that runs the scheduled query?
I looked at the Stackdriver Logging Python client, and it seems I would have to make API requests repeatedly to simulate real-time monitoring. I also couldn't find a way to filter the API results to the particular table, so I wrote some scripts to pick the relevant entries out of the log results myself.
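For what it's worth, the entries can be narrowed server-side with an advanced logs filter instead of post-processing them in Python. Below is a rough polling sketch; the filter assumes the legacy BigQuery audit-log format (`jobservice.jobcompleted` events), and the dataset and table names are placeholders. It still polls rather than reacting in real time:

```python
import time

from google.cloud import logging as cloud_logging

# Placeholders: adjust dataset/table to the table being loaded.
# Assumes the legacy BigQuery audit-log (AuditData) entry format.
LOG_FILTER = """
resource.type="bigquery_resource"
protoPayload.methodName="jobservice.jobcompleted"
protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.load.destinationTable.datasetId="my_dataset"
protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.load.destinationTable.tableId="my_table"
"""


def poll_for_load_completion(poll_seconds=60):
    """Poll Cloud Logging for completed load jobs against one table."""
    client = cloud_logging.Client()
    seen = set()
    while True:
        # In practice you would also bound the filter by timestamp
        # so each poll only scans recent entries.
        for entry in client.list_entries(filter_=LOG_FILTER):
            if entry.insert_id not in seen:
                seen.add(entry.insert_id)
                print(f"Load job completed at {entry.timestamp}")
                # trigger_scheduled_query()  # from the script above
        time.sleep(poll_seconds)
```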
Is there any library that does this natively?
Found a solution, but it uses other Google services.

Cloud Logging has a feature called a sink, which can route the logs to a Cloud Pub/Sub topic. A Cloud Function can be invoked from the Cloud Pub/Sub topic, and that function holds the Python code that sends the API request to run the scheduled query.
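A minimal sketch of that Cloud Function, assuming a first-generation Pub/Sub-triggered function and a sink whose filter matches the `jobservice.jobcompleted` entries for the table (as in the filter shown earlier); the transfer config path is a placeholder:

```python
import base64
import json
import time

from google.cloud import bigquery_datatransfer
from google.protobuf.timestamp_pb2 import Timestamp

# Placeholder: full resource name of the scheduled query.
TRANSFER_CONFIG = "projects/my-project/locations/us/transferConfigs/my-config-id"


def run_scheduled_query(event, context):
    """Pub/Sub-triggered Cloud Function: fires once per matching log entry."""
    # The sink delivers each log entry as a base64-encoded JSON message.
    entry = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    print(f"Triggered by log entry {entry.get('insertId')}")

    client = bigquery_datatransfer.DataTransferServiceClient()
    response = client.start_manual_transfer_runs(
        request={
            "parent": TRANSFER_CONFIG,
            "requested_run_time": Timestamp(seconds=int(time.time())),
        }
    )
    for run in response.runs:
        print(f"Started transfer run: {run.name}")
```

Because the sink filter only matches completed load jobs for that one table, the function runs exactly when needed and no polling is involved.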