
Azure Microservices Performance Insights - Collective Performance Counter Reporting


I have around 10 .NET microservice applications, all hosted on Azure Service Fabric. They are set up in a sequence, for example:

API call to Application 1 > stores data in Cosmos DB > sends a message to Application 2
Application 2 > depending on the data and business logic, sends a message to the relevant department (Application 3, 4, 5, etc.)
Application 3 > processes the data and stores it in a database

I want a performance metric that shows the start and end times, or the total time taken, for one end-to-end cycle of a payload.

I have come across a few possible solutions for this:

  1. Log metrics in Application Insights before and after method calls. Example: create a unique Guid and use it as the correlationId.

    Application 1 > Method1() - Record Start Time
    Application 1 > Method() - Record start and end time
    Application 3 > Method2() - Record start and end time
    Application 3 > Method2() - Record End Time

This is available in Application Insights when I search for that Guid. Even here I have a question: how could I improve the visibility of this data? Which options in Application Insights, such as charts or reports, could I use?

  2. Log as above but in a separate database, so that we control the data (Application Insights holds a huge amount of data and can't be a separate API). Create a new API that takes the Guid as input and returns a response something like this:

    Total EndToEnd Time: 10seconds
    Application1> Method2(): 2 seconds
    ...
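
As a rough illustration of the second idea, here is a minimal sketch of how such a response could be derived from stored timing records. It is written in Python purely as a language-neutral illustration (the services themselves are .NET). The record fields (`correlation_id`, `app`, `method`, `start`, `end`), the `build_report` function, and the sample values are all hypothetical.

```python
from datetime import datetime, timedelta

def build_report(records, correlation_id):
    """Summarise the timing records that share one correlation id.

    `records` is assumed to be a list of dicts with hypothetical keys:
    correlation_id, app, method, start, end.
    """
    rows = [r for r in records if r["correlation_id"] == correlation_id]
    if not rows:
        return None  # unknown Guid
    first_start = min(r["start"] for r in rows)
    last_end = max(r["end"] for r in rows)
    return {
        "correlation_id": correlation_id,
        # End-to-end time: earliest start to latest end across all apps.
        "total_seconds": (last_end - first_start).total_seconds(),
        # Per-method breakdown, ordered by when each step started.
        "steps": [
            {
                "step": f'{r["app"]} > {r["method"]}',
                "seconds": (r["end"] - r["start"]).total_seconds(),
            }
            for r in sorted(rows, key=lambda r: r["start"])
        ],
    }

# Made-up sample data for one payload travelling through three applications.
t0 = datetime(2021, 4, 22, 5, 0, 45)
records = [
    {"correlation_id": "abc", "app": "Application1", "method": "Method1()",
     "start": t0, "end": t0 + timedelta(seconds=2)},
    {"correlation_id": "abc", "app": "Application2", "method": "Method1()",
     "start": t0 + timedelta(seconds=3), "end": t0 + timedelta(seconds=6)},
    {"correlation_id": "abc", "app": "Application3", "method": "Method2()",
     "start": t0 + timedelta(seconds=7), "end": t0 + timedelta(seconds=10)},
]

report = build_report(records, "abc")
print(report["total_seconds"])
for step in report["steps"]:
    print(step["step"], step["seconds"])
```

The total here is measured from the earliest recorded start to the latest recorded end, so time spent in queues between applications is included in the total but not in any individual step.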

I know there could be better options, but I need some direction on this, please.


Solution

  • There are two ways to do it with Application Insights. Neither is ideal at this point.

    Option I. If you store all telemetry in the same resource and your app isn't under heavy load, you can aggregate (summarize) by CorrelationId. Here is an idea (you might want to extend it by recording the start time when the payload reaches Application 1 and the end time when it reaches Application 3):

    let Source = datatable(RoleName:string, CorrelationId:string, Timestamp:datetime)
    [
        'Application 1', '1', '2021-04-22T05:00:45.237Z',
        'Application 2', '1', '2021-04-22T05:01:45.237Z',
        'Application 3', '1', '2021-04-22T05:02:45.237Z',
        'Application 1', '2', '2021-04-22T05:00:45.237Z',
        'Application 2', '2', '2021-04-22T05:01:46.237Z',
        'Application 3', '2', '2021-04-22T05:02:47.237Z',
    ];
    Source
    | summarize min_timestamp=min(Timestamp), max_timestamp=max(Timestamp) by CorrelationId
    | extend duration = max_timestamp - min_timestamp
    | project CorrelationId, duration
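
The extension suggested above (start time taken from Application 1, end time taken from Application 3) can be sketched as follows. This is only an illustration of the logic in Python, not a Kusto query; it reuses the sample rows from the datatable, and the `parse` and `end_to_end` helpers are hypothetical names.

```python
from datetime import datetime

# Same sample rows as the datatable above: (role name, correlation id, timestamp).
rows = [
    ("Application 1", "1", "2021-04-22T05:00:45.237Z"),
    ("Application 2", "1", "2021-04-22T05:01:45.237Z"),
    ("Application 3", "1", "2021-04-22T05:02:45.237Z"),
    ("Application 1", "2", "2021-04-22T05:00:45.237Z"),
    ("Application 2", "2", "2021-04-22T05:01:46.237Z"),
    ("Application 3", "2", "2021-04-22T05:02:47.237Z"),
]

def parse(ts):
    # Swap the trailing "Z" for an explicit offset so that
    # datetime.fromisoformat also accepts it on older Python versions.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

def end_to_end(rows, first_app="Application 1", last_app="Application 3"):
    """Duration per correlation id, from first_app's earliest event
    to last_app's latest event."""
    starts, ends = {}, {}
    for role, cid, ts in rows:
        t = parse(ts)
        if role == first_app:
            starts[cid] = min(t, starts.get(cid, t))
        elif role == last_app:
            ends[cid] = max(t, ends.get(cid, t))
    # Only report payloads that reached both ends of the pipeline.
    return {cid: ends[cid] - starts[cid] for cid in starts if cid in ends}

durations = end_to_end(rows)
for cid, duration in durations.items():
    print(cid, duration.total_seconds())
```

Unlike the plain min/max over all rows, this ignores payloads that never reached the last application, so incomplete cycles do not show up with misleadingly short durations.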
    

    Option II. Application Insights supports the W3C Trace Context standard for distributed tracing of HTTP calls. If you manually propagate the distributed tracing context through your messages (between applications) and restore it on the receiving side, you can do the following:

    1. In Application 1, put the start time in Baggage.
    2. This field gets propagated across applications (note that OperationId propagates as well).
    3. In Application N you will know exactly when a particular request/transaction started, so you can emit a proper metric.
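
The steps above can be sketched as follows. This is a minimal, language-neutral illustration in Python of carrying the context in message headers by hand; in .NET the same role is typically played by `System.Diagnostics.Activity` and its baggage. The header layout follows the W3C `traceparent` and `baggage` formats, while the `startTime` key, the helper names, and the explicit `now_ms` parameter are assumptions made for the example.

```python
import secrets

def start_transaction(now_ms):
    """Application 1: create the context that travels with the message."""
    trace_id = secrets.token_hex(16)  # 32 hex characters
    span_id = secrets.token_hex(8)    # 16 hex characters
    return {
        # traceparent: version-traceid-parentid-flags
        "traceparent": f"00-{trace_id}-{span_id}-01",
        # Hypothetical baggage key holding the start time in epoch milliseconds.
        "baggage": f"startTime={now_ms}",
    }

def forward(headers):
    """Intermediate application: keep trace id and baggage, issue a new parent id."""
    version, trace_id, _parent_id, flags = headers["traceparent"].split("-")
    return {
        "traceparent": f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}",
        "baggage": headers["baggage"],
    }

def parse_baggage(value):
    """Parse 'k1=v1,k2=v2;prop' into a dict, ignoring member properties."""
    items = {}
    for member in value.split(","):
        key, _, val = member.strip().partition("=")
        items[key] = val.split(";")[0]
    return items

def elapsed_ms(headers, now_ms):
    """Application N: time since the transaction started in Application 1."""
    start_ms = int(parse_baggage(headers["baggage"])["startTime"])
    return now_ms - start_ms

first = start_transaction(now_ms=1_000)  # Application 1
last = forward(forward(first))           # passes through Applications 2 and 3
print(elapsed_ms(last, now_ms=11_000))   # value to emit as the end-to-end metric
```

Because the trace id stays the same at every hop, the emitted metric can still be correlated with the rest of the telemetry for that transaction.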