google-cloud-platform metrics stackdriver google-cloud-stackdriver

Create a log-based metric that keeps track of the delta between 2 logs


Let's say I have 3 applications that work together and use pub-sub to send messages for the "management". Let's say that a "transaction id" is created at the start, passed through the applications, and written to the logs.

The logs will look like the following:

app1 - transactionIdX - started - timestamp01
app1 - transactionIdX - ended - timestamp02
app2 - transactionIdX - started - timestamp03
app1 - transactionIdY - started - timestamp04
app1 - transactionIdY - ended - timestamp05
app2 - transactionIdX - ended - timestamp06
app3 - transactionIdX - started - timestamp07
app2 - transactionIdY - started - timestamp08
app2 - transactionIdY - ended - timestamp09
app3 - transactionIdX - ended - timestamp10
app3 - transactionIdY - started - timestamp11
app3 - transactionIdY - ended - timestamp12

I would like to have a metric that exposes this kind of information:

  • transactionIdX - at time timestamp10

    • in app1 - needed (timestamp02-timestamp01) seconds
    • in app2 - needed (timestamp06-timestamp03) seconds
    • in app3 - needed (timestamp10-timestamp07) seconds
    • in total - needed (timestamp10-timestamp01) seconds
  • transactionIdY - at time timestamp12

    • in app1 - needed (timestamp05-timestamp04) seconds
    • in app2 - needed (timestamp09-timestamp08) seconds
    • in app3 - needed (timestamp12-timestamp11) seconds
    • in total - needed (timestamp12-timestamp04) seconds

Is there a way to build a log-based metric that offers this kind of information?


Solution

  • Currently, this cannot be done by only using the logs-based metrics API. The logs-based metrics pipeline does not maintain state between two log entries, so you cannot capture two values and formulate a query to capture the difference between the two.

    I see two possible workarounds:

    1. Instrument your app to track transaction latency itself, write that value to the logs, and capture it using logs-based metrics.
    2. Perform the computation at query time with scripts, using something like Cloud Datalab, which integrates with Stackdriver.
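
As a rough illustration of the first workaround, the sketch below has the application measure its own elapsed time and write it into the "ended" log line. Everything named here is an assumption for illustration, not part of any Stackdriver API: the helper `timed_transaction`, the field names `app`, `transactionId`, `event`, and `durationSeconds`, and the choice to emit one JSON object per line.

```python
import json
import time
from contextlib import contextmanager


@contextmanager
def timed_transaction(app, transaction_id, emit=print):
    """Emit 'started' and 'ended' log lines; 'ended' carries the elapsed seconds.

    Hypothetical helper: names and field layout are illustrative assumptions.
    """
    start = time.monotonic()  # monotonic clock, unaffected by wall-clock changes
    emit(json.dumps({"app": app, "transactionId": transaction_id, "event": "started"}))
    try:
        yield
    finally:
        elapsed = time.monotonic() - start
        emit(json.dumps({
            "app": app,
            "transactionId": transaction_id,
            "event": "ended",
            "durationSeconds": round(elapsed, 6),
        }))


if __name__ == "__main__":
    with timed_transaction("app1", "transactionIdX"):
        time.sleep(0.05)  # stand-in for the real work done by app1
```

Assuming these lines are ingested as structured payloads, a distribution-type logs-based metric could then extract `durationSeconds` directly, because the value already sits in a single log entry. This covers only the per-app durations; for the end-to-end total, one option would be to pass the original start timestamp along with the message so that the last application can log the overall duration as well.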

    Disclaimer: I am an engineer in Google Stackdriver.
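
The second workaround, computing the deltas at query time, might look roughly like the sketch below once the log entries have been fetched by whatever means is available. Fetching is deliberately left out. The function name `transaction_latencies`, the `(app, transaction_id, event, timestamp)` tuple layout, and the use of plain numeric seconds for timestamps are all assumptions made for illustration; the sample values 1 to 12 simply stand in for timestamp01 to timestamp12 from the question.

```python
from collections import defaultdict


def transaction_latencies(entries):
    """Pair 'started'/'ended' entries and return per-app and total durations.

    entries: iterable of (app, transaction_id, event, timestamp_seconds).
    Hypothetical helper; the tuple layout is an illustrative assumption.
    """
    open_starts = {}             # (app, txn) -> timestamp of the pending 'started'
    first_start = {}             # txn -> earliest 'started' timestamp
    last_end = {}                # txn -> latest 'ended' timestamp
    per_app = defaultdict(dict)  # txn -> {app: duration}

    for app, txn, event, ts in entries:
        if event == "started":
            open_starts[(app, txn)] = ts
            first_start[txn] = min(ts, first_start.get(txn, ts))
        elif event == "ended" and (app, txn) in open_starts:
            per_app[txn][app] = ts - open_starts.pop((app, txn))
            last_end[txn] = max(ts, last_end.get(txn, ts))

    # Total is measured from the first 'started' to the last 'ended' of a transaction.
    return {
        txn: dict(apps, total=last_end[txn] - first_start[txn])
        for txn, apps in per_app.items()
    }


if __name__ == "__main__":
    sample = [
        ("app1", "transactionIdX", "started", 1),
        ("app1", "transactionIdX", "ended", 2),
        ("app2", "transactionIdX", "started", 3),
        ("app1", "transactionIdY", "started", 4),
        ("app1", "transactionIdY", "ended", 5),
        ("app2", "transactionIdX", "ended", 6),
        ("app3", "transactionIdX", "started", 7),
        ("app2", "transactionIdY", "started", 8),
        ("app2", "transactionIdY", "ended", 9),
        ("app3", "transactionIdX", "ended", 10),
        ("app3", "transactionIdY", "started", 11),
        ("app3", "transactionIdY", "ended", 12),
    ]
    for txn, durations in transaction_latencies(sample).items():
        print(txn, durations)
```

Unlike a logs-based metric, this script keeps state between entries (the `open_starts` dictionary), which is exactly what the metrics pipeline cannot do. Entries whose "ended" line has no matching "started" line are skipped rather than guessed at.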