Search code examples
c#open-telemetryjaeger

C# distributed tracing with Jaeger / OpenTelemetry, but only the caller should report finally


I have the following scenario:

Service 1 is an ASP.Net service based on the worker template (See https://learn.microsoft.com/en-us/dotnet/core/extensions/workers). This means that there is a simple background service that is added as hosted service (services.AddHostedService<TestService>();) and runs some dummy HTTP requests against service 2.

Example:

public sealed class TestService : BackgroundService
{
    private readonly HttpClient httpClient = new()
    {
        BaseAddress = new Uri("http://localhost:5001")
    };

    public override async Task StartAsync(CancellationToken cancellationToken)
    {
        Log.Logger.Information("Starting test service");
        await base.StartAsync(cancellationToken);
    }

    public override async Task StopAsync(CancellationToken cancellationToken)
    {
        Log.Logger.Information("Stopping test service");
        await base.StopAsync(cancellationToken);
    }

    protected override async Task ExecuteAsync(CancellationToken cancellationToken)
    {
        Log.Logger.Information("Executing test service");
        
        while (!cancellationToken.IsCancellationRequested)
        {
            Log.Logger.Information("Running test service request");
            await this.httpClient.GetAsync("/", cancellationToken);
            await Task.Delay(10000, cancellationToken);
        }
    }
}

Service 1 is configured to do tracing to console and OLTP like this within the ASP.Net pipeline:

services.AddOpenTelemetry()
    .ConfigureResource(resource => resource.AddService(serviceName: "Test1Service"))
    .WithTracing(tracing => 
        tracing
            .AddAspNetCoreInstrumentation()
            .AddConsoleExporter()
            .AddOtlpExporter());

Service 2 is implemented in the same way, but doesn't report to Jaeger directly, e.g. it only sets the activity source like this around the dummy HTTP requests:

private static readonly ActivitySource traceSpan = new("Tracing.Test2");

app.UseEndpoints(endpoints =>
{
    endpoints.MapGet(
        "/",
        async context =>
        {
            using var activity = traceSpan.StartActivity("Request");
            activity?.SetTag("Beer", "Manfred");
            activity?.SetStatus(ActivityStatusCode.Ok);

            var sb = new StringBuilder();
            sb.Append("Hello there!");
            await context.Response.WriteAsync($"{sb}");
        });
});

Jaeger is run locally on Windows using the official binaries (jaeger-1.55.0-windows-amd64) and started with jaeger-all-in-one.exe --collector.otlp.enabled.

What I expected to happen: Start service 2 first, then service 1. Service 1 sends HTTP requests to service 2 all the time (With 10 seconds delay). Service 1 logs to the local Jaeger instance and reports the requests including traces from itself and service 2. Service 2 only adds its trace data to the requests and doesn't report directly (But implicit) to Jaeger.

What happens: Service 1 sends HTTP requests properly to service 2 and waits. However, Jaeger doesn't seem to receive anything at all except some traces for a service called jaeger-all-in-one which seems to be something default? So, basically, no distributed tracing is received... What do I need to check / change here?

Screenshot of Jaeger

Just as reference, I followed the following links:

Process example with two services

Update: I have added AddHttpClientInstrumentation() from OpenTelemetry.Instrumentation.Http and now the traces from service 1 are sent to Jaeger properly. However, I still search for a way to add values from service 2 to the trace without having to add AddOtlpExporter() for service 2 explicitely (See image above) as service 2 is inside a separate network segment where no trace collector can be run.


Solution

  • I have done this differently now:

    Service 1 now gets the 'traces' from a certain header that service 2 sets in a middleware.

    Logic for service 1 (HttpClient side):

    using System.Diagnostics;
    using System.Diagnostics.CodeAnalysis;
    using System.Reflection;
    using System.Text.Json;
    using System.Text.Json.Serialization;
    using ILogger = Serilog.ILogger;
    
    public static async Task<List<TraceContextModel>> GetTracesFromResponse(HttpResponseMessage message, ILogger? logger = null)
    {
        var result = new List<TraceContextModel>();
        var headers = message.Headers.Where(x => x.Key.StartsWith("X-Tracing")).ToList();
    
        foreach (var header in headers)
        {
            foreach (var headerValue in header.Value)
            {
                if (!string.IsNullOrWhiteSpace(headerValue))
                {
                    try
                    {
                        var deserializedTrace = JsonSerializer.Deserialize<TraceContextModel?>(headerValue);
    
                        if (deserializedTrace is not null)
                        {
                            result.Add(deserializedTrace);
                        }
                    }
                    catch (Exception ex)
                    {
                        logger?.Error(ex, "Failed to deserialize trace span");
                    }
                }
            }
        }
    
        return result;
    }
    
    public sealed record class TraceContextModel
    {
        public TraceContextModel()
        {
        }
    
        public TraceContextModel(Activity? activity)
        {
            if (activity is null)
            {
                return;
            }
    
            this.TraceId = activity.TraceId.ToString();
            this.SpanId = activity.SpanId.ToString();
            this.TraceFlags = activity.ActivityTraceFlags.ToString();
            this.ParentSpanId = activity.ParentSpanId.ToString();
            this.ActivitySourceName = activity.Source.Name;
            this.DisplayName = activity.DisplayName;
            this.Kind = activity.Kind.ToString();
            this.StartTime = activity.StartTimeUtc;
            this.Duration = activity.Duration;
            this.Tags = activity.Tags;
        }
    
        [JsonPropertyName("TraceId")]
        public string TraceId { get; init; } = string.Empty;
    
        [JsonPropertyName("SpanId")]
        public string SpanId { get; init; } = string.Empty;
    
        [JsonPropertyName("TraceFlags")]
        public string TraceFlags { get; init; } = string.Empty;
    
        [JsonPropertyName("ParentSpanId")]
        public string ParentSpanId { get; init; } = string.Empty;
    
        [JsonPropertyName("ActivitySourceName")]
        public string ActivitySourceName { get; init; } = string.Empty;
    
        [JsonPropertyName("DisplayName")]
        public string DisplayName { get; init; } = string.Empty;
    
        [JsonPropertyName("Kind")]
        public string Kind { get; init; } = string.Empty;
    
        [JsonPropertyName("StartTime")]
        public DateTime StartTime { get; init; }
    
        [JsonPropertyName("Duration")]
        public TimeSpan Duration { get; init; }
    
        [JsonPropertyName("Tags")]
        public IEnumerable<KeyValuePair<string, string?>> Tags { get; init; } = [];
    }
    

    On service 2 side, this is done with a custom middleware like this:

    using System.Diagnostics;
    using System.Diagnostics.CodeAnalysis;
    using System.Reflection;
    using System.Text.Json;
    using System.Text.Json.Serialization;
    using ILogger = Serilog.ILogger;
    
    public sealed class CustomTracingMiddleware
    {
        private static readonly string serviceName = Assembly.GetEntryAssembly()?.GetName()?.Name ?? string.Empty;
    
        private readonly ActivitySource activitySource = new(serviceName);
    
        private readonly RequestDelegate next;
    
        public CustomTracingMiddleware(RequestDelegate next)
        {
            // Add activity listener. Don't ask why, followed https://github.com/dotnet/runtime/issues/45070.
            var activityListener = new ActivityListener
            {
                ShouldListenTo = s => true,
                SampleUsingParentId = (ref ActivityCreationOptions<string> activityOptions) => ActivitySamplingResult.AllData,
                Sample = (ref ActivityCreationOptions<ActivityContext> activityOptions) => ActivitySamplingResult.AllData
            };
            ActivitySource.AddActivityListener(activityListener);
    
            // Add the next delegate.
            this.next = next;
        }
    
        public async Task Invoke(HttpContext context)
        {
            using var activity = this.activitySource.StartActivity(context.Request.Path);
            var value = JsonSerializer.Serialize(new TraceContextModel(activity));
            context.Response.Headers.Append($"X-Tracing-{serviceName}", value);
            await this.next.Invoke(context);
        }
    }
    

    and used like this in Configure(IApplicationBuilder app, IWebHostEnvironment env):

    app.UseMiddleware<IvuTracingMiddleware>();
    

    The solution works as I wanted, but with 2 downgrades, of course:

    1. It's not really tracing (as we can't write this to Jaeger, etc. but just log it)
    2. The Duration of the traces is always 00:00:00 as the middleware can only inject headers before the next action is done (Otherwise the headers might have been already written and an exception is thrown).