We have an ASP.NET Core reverse proxy application running in AWS ECS, originally built on .NET Core 3.1 and currently running on .NET 6.
AppA (AWS)
---> ProxyB (AWS)
---> AppC (on-prem)
New Relic distributed tracing from our proxy app into our other internal apps has always worked. But one day we asked why we don't see traces that start in some of our sister organizations, given that all of us now belong to the same enterprise NR master account.
We initially thought that perhaps our sister organizations just didn't have distributed tracing turned on, but that was quickly determined not to be the case.
We then focused on our proxy app, which looks a lot like the one described here, including the following method:
private void CopyFromOriginalRequestContentAndHeaders(HttpContext context, HttpRequestMessage requestMessage)
{
    //snipped
    foreach (var header in context.Request.Headers)
    {
        // In this loop, every inbound header is copied to the content headers.
        requestMessage.Content?.Headers.TryAddWithoutValidation(header.Key, header.Value.ToArray());
    }
}
We added some additional logging of the headers at AppA, ProxyB, and AppC. Our logs showed the outgoing trace headers we expected from AppA (traceparent, tracestate, and newrelic) being received by ProxyB and sent back out as expected.
But the header values AppC was receiving were not the ones ProxyB had logged as outbound.
As you can imagine, there's lots of hardware/software between ProxyB and AppC. We didn't know where the headers were being changed.
After we opened a support case with New Relic, their engineers were able to tell us:
Our engineers have been looking over this data and, well, currently we can't make sense of it. For some reason though AppC is receiving two different (comma-separated) values for each of the three headers, making them invalid. The headers sent by AppB are appended to new values, for example if AppA sends and ProxyB receives and outputs:
"traceparent": "A" "tracestate": "B" "newrelic": "C"
AppC is receiving:
"traceparent": "X, A" "tracestate": "Y, B" "newrelic": "Z, C"
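To illustrate why a comma-joined value is a problem for traceparent specifically, here is a minimal sketch that checks values against the version 00 layout from the W3C Trace Context specification. The pattern is a simplified assumption (it does not reject all-zero IDs, for example), and the sample values are illustrative, not taken from our logs.

```csharp
using System;
using System.Text.RegularExpressions;

// Version 00 layout: "00-<32 hex trace-id>-<16 hex parent-id>-<2 hex flags>".
// Simplified check: all-zero trace-id or parent-id values are not rejected here.
var traceparentPattern = new Regex("^00-[0-9a-f]{32}-[0-9a-f]{16}-[0-9a-f]{2}$");

// Illustrative values standing in for "A" (from AppA) and "X" (the extra value).
string fromAppA = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01";
string extraValue = "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01";

// What the receiving app effectively sees: two values joined into one header value.
string combined = extraValue + ", " + fromAppA;

Console.WriteLine(traceparentPattern.IsMatch(fromAppA));   // True
Console.WriteLine(traceparentPattern.IsMatch(extraValue)); // True
Console.WriteLine(traceparentPattern.IsMatch(combined));   // False
```

Each value passes on its own, but the joined string no longer matches the layout, which is consistent with what the engineers described.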
After realizing that X, Y, and Z were each valid values on their own, just not valid as part of a list, we decoded them and confirmed that they were being added by ProxyB.
That helped us realize what was happening: the New Relic APM agent was looking at HttpRequestMessage.Headers, not seeing the trace headers there, and generating a new set of them. The new set then got combined with the values in HttpRequestMessage.Content.Headers when the request was placed on the wire.
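The split between the two header collections can be seen in isolation with a small sketch. The URL and the header value "A" are placeholders, and nothing is sent over the network; the point is only that a header added to Content.Headers is not visible through Headers on the same request.

```csharp
using System;
using System.Net.Http;

// Hypothetical target URL; the request is only built, never sent.
var content = new ByteArrayContent(Array.Empty<byte>());
var request = new HttpRequestMessage(HttpMethod.Post, "http://appc.example/") { Content = content };

// Mimic the original copy loop: the trace header goes to the content headers only.
content.Headers.TryAddWithoutValidation("traceparent", "A");

// Anything inspecting only request.Headers will conclude the header is missing.
bool onRequest = request.Headers.Contains("traceparent");
bool onContent = content.Headers.Contains("traceparent");

Console.WriteLine($"request-level: {onRequest}, content-level: {onContent}");
// prints "request-level: False, content-level: True"
```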
There didn't seem to be an easy and performant built-in way to determine which of the inbound headers should be output at the request level and which at the content level, short of hardcoding a list.
But then we found the YARP: Yet Another Reverse Proxy project, and its code provided what we were looking for.
private void CopyFromOriginalRequestContentAndHeaders(HttpContext context, HttpRequestMessage requestMessage)
{
    //snipped
    foreach (var header in context.Request.Headers)
    {
        // Try the request-level headers first; fall back to the content headers only if that fails.
        if (!requestMessage.Headers.TryAddWithoutValidation(header.Key, header.Value.ToArray()))
        {
            requestMessage.Content?.Headers.TryAddWithoutValidation(header.Key, header.Value.ToArray());
        }
    }
}
It's interesting that at the request level, TryAddWithoutValidation() will return false and not add the header, whereas at the content level, the header gets added regardless of whether or not it's a "content" header as determined by Microsoft.
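A small standalone sketch of that fallback shows where two sample headers end up. The CopyHeader helper, the URL, and the header values are hypothetical and exist only for this illustration; the request is built but never sent.

```csharp
using System;
using System.Net.Http;

// Hypothetical target URL; nothing is sent over the network.
var content = new ByteArrayContent(Array.Empty<byte>());
var request = new HttpRequestMessage(HttpMethod.Post, "http://appc.example/") { Content = content };

// Hypothetical helper: reports where the header ended up.
string CopyHeader(string name, string value)
{
    if (request.Headers.TryAddWithoutValidation(name, value)) return "request";
    if (content.Headers.TryAddWithoutValidation(name, value)) return "content";
    return "dropped";
}

string traceparentTarget = CopyHeader("traceparent", "A");
string contentTypeTarget = CopyHeader("Content-Type", "application/json");

Console.WriteLine($"traceparent -> {traceparentTarget}");   // traceparent -> request
Console.WriteLine($"Content-Type -> {contentTypeTarget}");  // Content-Type -> content
```

With the request-level attempt made first, the trace headers land where the agent looks for them, and only headers the request collection refuses fall through to the content.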