Tags: google-bigquery, google-cloud-stackdriver

How to monitor Stackdriver logs for a specific table in Bigquery in real time using python?


I am using a BI tool to view data in BigQuery. The data is spread across multiple tables, so my query has several JOINs to get the data into the format I require. Since processing all this data takes about a minute and I would like a near-real-time experience, I created a scheduled query that runs the JOINs and saves the output to a table. The source tables are fed by an ETL tool every 30 minutes (incremental load), which creates a BigQuery load job each time.

For one particular table, I want to run the scheduled query as soon as the load job for that table is done.

I have disabled the schedule on the scheduled query so that it can only be run via an API call, and I have written a Python script that sends that API request to the scheduled query.
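For context, a minimal sketch of such a trigger script, assuming the google-cloud-bigquery-datatransfer client library (scheduled queries are transfer configs in the Data Transfer Service); the project, location, and config ID below are hypothetical placeholders:

```python
def transfer_config_name(project_id: str, location: str, config_id: str) -> str:
    """Build the resource name of a scheduled query (a transfer config)."""
    return f"projects/{project_id}/locations/{location}/transferConfigs/{config_id}"


def trigger_scheduled_query(config_name: str) -> None:
    """Request an immediate manual run of the scheduled query.

    Needs the google-cloud-bigquery-datatransfer package and application
    default credentials; the import lives inside the function so the
    helper above stays dependency-free.
    """
    from datetime import datetime, timezone

    from google.cloud import bigquery_datatransfer_v1

    client = bigquery_datatransfer_v1.DataTransferServiceClient()
    client.start_manual_transfer_runs(
        request={
            "parent": config_name,
            "requested_run_time": datetime.now(timezone.utc),
        }
    )


# Hypothetical identifiers -- replace with your own project and config ID.
name = transfer_config_name("my-project", "us", "12345678-0000-0000-0000-000000000000")
```

The config ID is the UUID shown in the scheduled query's configuration page in the BigQuery console.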

Is there any way in Python to monitor the logs for a particular BigQuery table in real time, so that when the job status changes to 'Succeeded' for that table, I can send the API request that runs the scheduled query?

I looked at the Stackdriver Logging Python client, and it seems I have to make repeated API requests to simulate real-time monitoring. I also couldn't get the API request to filter the results down to the particular table, so I wrote some scripts that do that filtering on the log results for me.
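For the polling approach, the filtering can usually be pushed into the request itself via a log filter. A sketch, assuming the google-cloud-logging client and the legacy BigQuery audit-log (AuditData) field layout; the field paths should be verified against your own log entries:

```python
def table_load_filter(dataset_id: str, table_id: str) -> str:
    """Build a Cloud Logging filter matching completed load jobs that
    wrote into one specific BigQuery table (legacy AuditData layout)."""
    return (
        'resource.type="bigquery_resource" '
        'protoPayload.methodName="jobservice.jobcompleted" '
        "protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.load"
        f'.destinationTable.datasetId="{dataset_id}" '
        "protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.load"
        f'.destinationTable.tableId="{table_id}"'
    )


def poll_for_load(dataset_id: str, table_id: str) -> None:
    """Poll Cloud Logging with the filter (needs google-cloud-logging)."""
    import time

    from google.cloud import logging as cloud_logging

    client = cloud_logging.Client()
    log_filter = table_load_filter(dataset_id, table_id)
    while True:
        entries = list(client.list_entries(filter_=log_filter, page_size=10))
        if entries:
            break  # a load job for the table completed; trigger the query here
        time.sleep(60)  # still polling, not truly real time
```

This narrows the results server-side, but it is still a polling loop, which is what the sink-based solution below avoids.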

Is there any library that does it natively?


Solution

  • Found a solution, but it uses other Google services.

    Cloud Logging has a feature called a sink, which can route logs to a Cloud Pub/Sub topic.

    A Cloud Function can be invoked from the Cloud Pub/Sub topic.

    The Cloud Function contains the Python code that sends the API request to run the scheduled query.