.net-6.0 open-telemetry datadog

Datadog Error Tracking not working: error attributes are not being exported or parsed correctly


I have a .NET 6 API project. I'm using OpenTelemetry (OTel) to instrument it and send traces to Datadog. I use the otel/opentelemetry-collector-contrib:latest image to receive telemetry from my application and forward it to Datadog.

I configured an ExceptionMiddleware to capture all exceptions and set span attributes from them:

public async Task InvokeAsync(HttpContext context)
    {
        string userId = GetUserId(context);
        try
        {
            await this._next(context);
        }
        catch (BusinessException bex)
        {
            LogWarning(bex);
            UpdateSpan(bex, userId);
            await this.HandleBadRequest(bex.Message, bex.StackTrace, context);
        }
        catch (Exception ex)
        {
            LogError(ex);
            string errorId = UpdateSpan(ex, userId);
    
            await this.HandleInternalServerError($"ErrorId: {errorId}", ex.StackTrace, context);
        }
    }

    private static string UpdateSpan(Exception ex, string userId)
    {
        Activity activity = Activity.Current;
        if (activity == null)
            return _defaultActivityId;
    
        string errorId = activity.Id.ToString();
        activity?.SetStatus(ActivityStatusCode.Error, ex.Source);
    
        activity?.SetTag("exception.message", ex.Message);
        activity?.SetTag("exception.stacktrace", ex.StackTrace);
        activity?.SetTag("exception.type", ex.Source);
    
        return errorId;
    }

    private void LogWarning(BusinessException bex)
    {
        Dictionary<string, object> errorDict = BuildExceptionAtributes(bex);
    
        using (_logger.BeginScope(errorDict))
        {
            _logger.LogWarning(bex.Message);
        }
    }
    
    private void LogError(Exception ex)
    {
        Dictionary<string, object> errorDict = BuildExceptionAtributes(ex);
    
        using (_logger.BeginScope(errorDict))
        {
            _logger.LogError(ex.Message);
        }
    }
    
    
    private static Dictionary<string, object> BuildExceptionAtributes(Exception ex)
    {
        var errorDict = new Dictionary<string, object>()
        {
            // otel
            ["exception.message"] = ex.Message,
            ["exception.stacktrace"] = ex.StackTrace,
            ["exception.type"] = ex.Source,
        };
        return errorDict;
    }

I created an OTel Collector processor to map the exception attributes to error attributes, following Datadog's documentation on default standard attributes.

Exception attributes are exported correctly, but error attributes are not.

Example:

  • the value of exception.type ends up in error.message
  • exception.stacktrace becomes error.stack (ok)
  • exception.message is ignored
  • error.type disappears

{
    "exception": {
        "message": "Object reference not set to an instance of an object.",
        "stacktrace": "at blablabla.GetConnection(Int64 userId, Nullable`1 version) in blablabla.cs:line 36",
        "type": "blablabla.WebApi"
    },
    "error": {
        "message": "blablabla.WebApi",
        "stack": "at blablabla.GetConnection(Int64 userId, Nullable`1 version) in blablabla.cs:line 36",
        "type": "does-not-show"
    }
}

This is my otel-collector config.yaml:

# This is a configuration file for the OpenTelemetry Collector intended to be
# used in conjunction with the OTLP Exporter example (see ../TestOtlpExporter.cs)
#
# For more information about the OpenTelemetry Collector see:
#   https://github.com/open-telemetry/opentelemetry-collector
#
receivers:
  otlp:
    protocols:
      grpc:
      http:

exporters:
  logging:
    verbosity: detailed
  datadog:
    api:
      key: some-key
      # site: 

processors:
  # https://github.com/open-telemetry/opentelemetry-collector/blob/main/processor/batchprocessor/README.md
  batch:
  attributes/insert:
    actions:
      - key: "error.message"
        action: "insert"
        from_attribute: "exception.message"
  attributes/insert2:
    actions:
      - key: "error.stack"
        action: "insert"
        from_attribute: "exception.stacktrace"
  attributes/insert3:
    actions:
      - key: "error.type"
        action: "insert"
        from_attribute: "exception.type"
      

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch, attributes/insert, attributes/insert2, attributes/insert3]
      exporters: [logging, datadog]
    metrics:
      receivers: [otlp]
      exporters: [logging, datadog]
    logs:
      receivers: [otlp]
      exporters: [logging, datadog]

===

I also tried setting the error attributes directly in the application, without the OTel Collector processors, and got exactly the same problem as described above.

private static string UpdateSpan(Exception ex, string userId)
{
    Activity activity = Activity.Current;
    if (activity == null)
        return _defaultActivityId;

    string errorId = activity.Id.ToString();
    activity?.SetStatus(ActivityStatusCode.Error, ex.Source);
    activity?.SetTag("kinvo.user.id", userId);

    activity?.SetTag("exception.message", ex.Message);
    activity?.SetTag("exception.stacktrace", ex.StackTrace);
    activity?.SetTag("exception.type", ex.Source);

    // sending directly
    activity?.SetTag("error.message", ex.Message);
    activity?.SetTag("error.stack", ex.StackTrace);
    activity?.SetTag("error.type", ex.Source);

    return errorId;
}

Solution

  • I found the problem. It is in this line of code:

    activity?.SetStatus(ActivityStatusCode.Error, ex.Source);
    

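    The second parameter of SetStatus is the status description. When it is set, its value overrides error.message when the span is exported to Datadog, which is why ex.Source showed up there.

    So I just need to call it without the description: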

    activity?.SetStatus(ActivityStatusCode.Error);
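
The difference between the two calls can be seen locally, without any exporter. What follows is only a minimal sketch using System.Diagnostics: the class name SpanErrorSketch, the two helper methods, and the activity names are made up for illustration, and the activities are created with new Activity(...) rather than coming from the ASP.NET Core instrumentation. It shows that the second argument is stored in Activity.StatusDescription; that this description then replaces error.message in Datadog is the behavior reported above, not something the sketch itself proves.

```csharp
using System;
using System.Diagnostics;

public static class SpanErrorSketch
{
    // Variant from the question: ex.Source is stored as the status description.
    public static void MarkErrorWithDescription(Activity activity, Exception ex)
    {
        activity.SetStatus(ActivityStatusCode.Error, ex.Source);
        activity.SetTag("error.message", ex.Message);
    }

    // Variant from the answer: no description, only the tag carries the message.
    public static void MarkError(Activity activity, Exception ex)
    {
        activity.SetStatus(ActivityStatusCode.Error);
        activity.SetTag("error.message", ex.Message);
    }

    public static void Main()
    {
        // Throw and catch so that the exception has Source and StackTrace populated.
        Exception ex;
        try { throw new InvalidOperationException("boom"); }
        catch (Exception caught) { ex = caught; }

        using Activity withDescription = new Activity("with-description").Start();
        MarkErrorWithDescription(withDescription, ex);
        string first = withDescription.StatusDescription ?? "(none)";
        Console.WriteLine("with description:    " + first);

        using Activity withoutDescription = new Activity("without-description").Start();
        MarkError(withoutDescription, ex);
        string second = withoutDescription.StatusDescription ?? "(none)";
        object? tag = withoutDescription.GetTagItem("error.message");
        Console.WriteLine("without description: " + second);
        Console.WriteLine("error.message tag:   " + tag);
    }
}
```

In the second case the status is still Error, but StatusDescription stays null, so the only message on the span is the one in the error.message tag.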