We're sending a copy of the data generated by Application Insights to an Event Hub, using the standard Sinks extensibility in the client SDK. We follow the same batching and compression logic as the default sink; the only difference is that the data is sent to an Event Hub endpoint.
In the function app that receives the data, a single Event Hub message therefore contains a JSON stream with a number of telemetry points, compressed using gzip.
We need to deserialize the stream and take a number of actions based on the telemetry type. We'll be receiving about 50k per second, so performance is important.
I've noticed that the SDK is using Bond and has defined the public schema - https://github.com/Microsoft/ApplicationInsights-aspnetcore/tree/develop/Schema/PublicSchema
I'm currently doing something like this:
foreach (var eventHubMessage in messages)
{
    // Decompress the entire gzipped payload
    var decompressedData = DeserializeCompressedStream(eventHubMessage.Body.Array);
    // Deframe the JSON stream into individual items, e.g. data.Split(new[] { Environment.NewLine }, ...)
    var payloadItems = decompressedData.Deframe();
    foreach (var item in payloadItems)
    {
        // A standard JSON.NET conversion to get the item
        Envelope telemetryItem = ItemConverter.CreateTelemetryFromPayloadItem(item);
        // etc.
    }
}
This works, but the item-level conversion using JSON.NET is an expensive operation at this scale and is maxing out the CPU.
Assuming the application doing the deserialization has access to the types, e.g. https://github.com/Microsoft/ApplicationInsights-aspnetcore/tree/develop/test/ApplicationInsightsTypes, what would be the recommended and most efficient way to deserialize the JSON stream using the Bond definitions?
Unfortunately, you cannot deserialize the entire envelope because of an issue with lazy deserialization in Bond: https://github.com/Microsoft/bond/issues/96.
So you need to parse out the baseData some other way and then pass it to the Bond deserializer. Alternatively, just parse it as JSON with a JSON parser, as we do in unit tests:
JsonReader reader = new JsonTextReader(new StringReader(Encoding.UTF8.GetString(b, 0, b.Length)));
reader.DateParseHandling = DateParseHandling.None;
JObject obj = JObject.Load(reader);
return obj.ToObject<AI.TelemetryItem<TelemetryDataType>>();
I cannot comment on the most efficient way to do this, as I'm not sure what your task is. In some cases the most performant approach will be to not deserialize the entire payload at all.
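As a rough illustration of not deserializing everything, the sketch below decompresses the payload as a stream, reads it one line at a time, and uses JSON.NET's forward-only JsonTextReader to look only at the telemetry type before deciding whether a full conversion is worth it. It is a sketch under assumptions, not a recommended implementation: the class and method names (TelemetryPeek, ReadItems, ReadBaseType) are hypothetical, and it assumes each item is one line of JSON whose type is stored at data.baseType (for example "RequestData"). Verify both assumptions against your actual payload.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text;
using Newtonsoft.Json;

// Hypothetical helper, not part of the Application Insights SDK.
public static class TelemetryPeek
{
    // Decompresses a gzipped payload and yields one JSON item per line,
    // without first building one large string and splitting it.
    public static IEnumerable<string> ReadItems(byte[] gzipped)
    {
        using (var gzip = new GZipStream(new MemoryStream(gzipped), CompressionMode.Decompress))
        using (var lines = new StreamReader(gzip, Encoding.UTF8))
        {
            string line;
            while ((line = lines.ReadLine()) != null)
            {
                if (line.Length > 0)
                {
                    yield return line;
                }
            }
        }
    }

    // Scans the item token by token and returns the value at data.baseType,
    // or null if the item has no such property. No object graph is built.
    // Assumption: the envelope keeps its type at data.baseType.
    public static string ReadBaseType(string json)
    {
        using (var reader = new JsonTextReader(new StringReader(json)))
        {
            reader.DateParseHandling = DateParseHandling.None;
            while (reader.Read())
            {
                if (reader.TokenType == JsonToken.PropertyName
                    && reader.Path == "data.baseType")
                {
                    return reader.ReadAsString();
                }
            }
        }
        return null;
    }
}
```

A caller could then run the expensive conversion only for the types it acts on, for example: foreach (var item in TelemetryPeek.ReadItems(body)) { if (TelemetryPeek.ReadBaseType(item) == "RequestData") { /* full deserialization here */ } }. Whether this helps depends on how many items can be skipped, so it is worth measuring against the current approach.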