Search code examples
sqlgoogle-bigqueryaggregate-functions

Need to compute execution time using data in table in BigQuery


I am in need of calculating execution time (difference between timestamp) using data from table in Bigquery. This table contains sample data as below:

-----------------------------------------------------------
|Command Name | Timestamp                        | Status | 
-----------------------------------------------------------
|Command1     | 2022-11-03T19:16:16.767657+05:30 | START  |
-----------------------------------------------------------
|Command1     | 2022-11-03T19:26:16.767657+05:30 | DONE   |
-----------------------------------------------------------
|Command2     | 2022-11-03T19:36:16.767657+05:30 | START  |
-----------------------------------------------------------
|Command2     | 2022-11-03T19:56:16.767657+05:30 | DONE   |
-----------------------------------------------------------

First column contains the command, second column contains timestamp of the execution of the command and last column contains the respective status. With this data, the requirement is to calculate the execution time of each of the command which is defined as:

Execution time of command = Timestamp of command with status as DONE - Timestamp of command with status as START.

This data is stored in the BigQuery. As I am new to the BigQuery, could anyone please help me achieve this.

Thanks and Regards, Prafulla


Solution

  • You might consider below query using a window function,

    WITH sample_table AS (
      SELECT 'Command1' command, TIMESTAMP '2022-11-03T19:16:16.767657+05:30' timestamp, 'START' status UNION ALL
      SELECT 'Command1' command, '2022-11-03T19:26:16.767657+05:30' timestamp, 'DONE' status UNION ALL
      SELECT 'Command2' command, '2022-11-03T19:36:16.767657+05:30' timestamp, 'START' status UNION ALL
      SELECT 'Command2' command, '2022-11-03T19:56:16.767657+05:30' timestamp, 'DONE' status
    )
    SELECT command, TIMESTAMP_DIFF(LEAD(timestamp) OVER w, timestamp, SECOND) execution_time
      FROM sample_table QUALIFY status = 'START'
    WINDOW w AS (PARTITION BY command ORDER BY timestamp);
    

    or using an aggregation function.

    SELECT command, TIMESTAMP_DIFF(MAX(timestamp), MIN(timestamp), SECOND) execution_time
      FROM sample_table
     WHERE status IN ('START', 'DONE')
     GROUP BY 1;
    

    Query results

    enter image description here

    (Updated) to calculate the average execution time.

    WITH sample_table AS (
      -- same as above
    )
    SELECT AVG(execution_time) avg_execution_tme FROM (
      SELECT command, TIMESTAMP_DIFF(LEAD(timestamp) OVER w, timestamp, SECOND) execution_time
        FROM sample_table QUALIFY status = 'START'
      WINDOW w AS (PARTITION BY command ORDER BY timestamp)
    );