c# .net azure azure-application-insights sampling

Application Insights: misunderstanding sampling data


I'm trying to understand how sampling works, and I can't seem to find the exact wording anywhere. Sometimes some of my application's logs are missing from the traces. I've taken care to log only what is needed, so that part is fine: I don't get any unexpected data in the log.

Here is my configuration:

services.AddLogging();

services.Configure<TelemetryConfiguration>(telemetryConfiguration =>
{
    var telemetryProcessorChainBuilder = telemetryConfiguration.DefaultTelemetrySink.TelemetryProcessorChainBuilder;

    telemetryProcessorChainBuilder.UseAdaptiveSampling(maxTelemetryItemsPerSecond: 5, excludedTypes: "Request;Exception");

    telemetryProcessorChainBuilder.Build();
});
services.AddApplicationInsightsTelemetry(new ApplicationInsightsServiceOptions
{
    EnableAdaptiveSampling = false,     
});

In Azure, sampling is set to 100%.

And here are the log traces. Sometimes a message is logged twice, as expected, and sometimes only once. That is also odd, but it may just be that logging is "slower" sometimes.

[Screenshot: log traces]

And here are the log traces again, for a single second.

[Screenshot: log traces within one second]

My first question: does the sampling limit apply to identical messages, or to all messages in general? That is, with my configuration, is the limit reached when I get 5 messages with the same wording, or when I get 5 different messages within one second?

The second, more important question: do my traces disappear because of the sampling setup, or is it something else? How should sampling be set up if the app only emits logs I know about? Turned off completely, or set to a larger number of items?

I've read Microsoft's documentation, which clarified a lot of things for me, but unfortunately it's not completely clear.

I've read articles and discussions, but haven't found much.


Solution

  • Your settings mean:

    • You've disabled the default adaptive sampling (EnableAdaptiveSampling = false) because you configure sampling yourself on the telemetry processor chain.
    • The custom adaptive sampling targets a maximum of about 5 telemetry items per second, but it excludes the Request and Exception telemetry types. All Request and Exception telemetry is therefore always sent, regardless of volume.
    • Answer to your first question: sampling does not look at the message text. It applies to all telemetry items of the sampled types (here every type except Request and Exception), so 5 different messages count the same as 5 identical ones.
    • Answer to your second question: yes, sampling can make traces disappear. If your app produces more than roughly 5 items per second, the SDK lowers the sampling rate and the sampled-out items are dropped inside the app, so they are never sent or stored.
    • You can use telemetry processors together with adaptive sampling to filter out all other telemetry you don't want to have.

    A sample telemetry processor could look like this (.NET 8.0):

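    using Microsoft.ApplicationInsights.Channel;
    using Microsoft.ApplicationInsights.DataContracts;
    using Microsoft.ApplicationInsights.Extensibility;

    // Primary constructor (C# 12): "next" is the next processor in the chain.
    public class LogOnlyTelemetryProcessor(ITelemetryProcessor next) : ITelemetryProcessor
    {
        public void Process(ITelemetry item)
        {
            // Only allow logs (traces) to be sent
            if (item is TraceTelemetry)
            {
                next.Process(item);
            }

            // All other telemetry types are filtered out by not passing them on
        }
    }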
    
    // In your Program.cs or Startup.cs:
    services.Configure<TelemetryConfiguration>((config) =>
    {
        // Add the custom telemetry processor to the processor chain
        var builder = config.DefaultTelemetrySink.TelemetryProcessorChainBuilder;
        builder.Use((next) => new LogOnlyTelemetryProcessor(next));
        builder.Build();
    });
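
If the goal is to never lose a trace, the sampling setup from the question can be adjusted rather than filtered. The following is only a sketch under the assumption that the rest of the setup stays as in the question, and that `services` is the same `IServiceCollection` used there. It keeps adaptive sampling but adds `Trace` to the excluded types, so traces are always sent. Whether that is affordable depends on your log volume, which only you can judge.

```csharp
// At the top of the file:
using Microsoft.ApplicationInsights.AspNetCore.Extensions;
using Microsoft.ApplicationInsights.Extensibility;

// In Program.cs or Startup.cs ("services" is the IServiceCollection from the question):
services.Configure<TelemetryConfiguration>(config =>
{
    var builder = config.DefaultTelemetrySink.TelemetryProcessorChainBuilder;

    // Option A: keep adaptive sampling for other telemetry,
    // but exclude Trace so log messages are never sampled out.
    builder.UseAdaptiveSampling(
        maxTelemetryItemsPerSecond: 5,
        excludedTypes: "Request;Exception;Trace");

    // Option B (instead of A): do not call UseAdaptiveSampling at all.
    // Together with EnableAdaptiveSampling = false below, no SDK-side
    // sampling is applied and every item is sent.

    builder.Build();
});

services.AddApplicationInsightsTelemetry(new ApplicationInsightsServiceOptions
{
    // Turn off the default adaptive sampling so only the chain above applies.
    EnableAdaptiveSampling = false,
});
```

Raising maxTelemetryItemsPerSecond is a third possibility, but it only makes dropped traces less likely; it does not rule them out during bursts.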