c# .net-core rabbitmq masstransit kubernetes-health-check

Implement IHealthCheck in MassTransit for .NET Core 2.2


Is there a way to implement IHealthCheck for MassTransit? The IHealthCheck interface was introduced in .NET Core 2.2 Preview 3 (see "ASP.NET Core 2.2.0-preview3 now available").


Solution

  • Health checks are intended to ensure that the service is healthy by periodically verifying that all required downstream components (databases, message brokers, cache providers, and other services that our service depends on) are available. That's why Microsoft includes health checks such as SQL Server or URL checks by default.

    MassTransit is not infrastructure; it is a messaging middleware library. What you might be interested in there is metrics. There are a few metric collection libraries for MassTransit, like the one included in the main repository, for Application Insights, or those that I made for AppMetrics and Prometheus. Metrics tell you whether your consumers are doing okay. But whilst health checks are binary (healthy or not), metrics are relative. For example, if the number of errors equals the number of consumed messages within a window of a few minutes, you have a problem. The same applies if the critical time (the difference between the time when a message was sent or published and the time when it was fully handled) is continuously increasing or exceeds a certain threshold. You would need to set up more comprehensive alerting to detect those situations, based on values that you can tolerate.
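
    To make the "metrics are relative" point concrete, here is a minimal sketch of a sliding-window check on error ratio and critical time. `ConsumerWindowStats` and its members are hypothetical names made up for illustration; they are not part of MassTransit or of any metrics library, and in practice this logic would more likely live in your alerting tool than in application code.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical helper for illustration; not part of MassTransit or any metrics library.
    public sealed class ConsumerWindowStats
    {
        private readonly TimeSpan _window;
        private readonly object _gate = new object();
        private readonly Queue<(DateTime At, bool Failed, TimeSpan CriticalTime)> _samples =
            new Queue<(DateTime At, bool Failed, TimeSpan CriticalTime)>();

        public ConsumerWindowStats(TimeSpan window) => _window = window;

        // Record one handled message: when handling finished, whether it failed,
        // and its critical time (finish time minus send/publish time).
        // Assumes samples arrive in non-decreasing time order.
        public void Record(DateTime finishedAt, bool failed, TimeSpan criticalTime)
        {
            lock (_gate)
            {
                _samples.Enqueue((finishedAt, failed, criticalTime));
                Trim(finishedAt);
            }
        }

        // Fraction of messages in the window that failed (0 when the window is empty).
        public double ErrorRatio(DateTime now)
        {
            lock (_gate)
            {
                Trim(now);
                return _samples.Count == 0
                    ? 0.0
                    : (double)_samples.Count(s => s.Failed) / _samples.Count;
            }
        }

        // Largest critical time seen in the window (zero when the window is empty).
        public TimeSpan MaxCriticalTime(DateTime now)
        {
            lock (_gate)
            {
                Trim(now);
                return _samples.Count == 0 ? TimeSpan.Zero : _samples.Max(s => s.CriticalTime);
            }
        }

        // Alert when either tolerated value is reached or exceeded.
        public bool ShouldAlert(DateTime now, double maxErrorRatio, TimeSpan maxCriticalTime) =>
            ErrorRatio(now) >= maxErrorRatio || MaxCriticalTime(now) > maxCriticalTime;

        // Drop samples that have fallen out of the time window.
        private void Trim(DateTime now)
        {
            while (_samples.Count > 0 && now - _samples.Peek().At > _window)
                _samples.Dequeue();
        }
    }
    ```

    The thresholds passed to `ShouldAlert` are the "values that you can tolerate": an error ratio of 1.0 over five minutes corresponds to the case where every consumed message failed.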

    So, if you want health checks, consider checking connectivity to the message broker that your bus uses and to all the infrastructure and services that your consumers use. For the message broker, you can implement these checks using a native client.
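
    A broker connectivity check with a native client could look roughly like the sketch below. It assumes RabbitMQ as the broker (as the question's tags suggest) and the synchronous API of RabbitMQ.Client 5.x/6.x (`IConnectionFactory.CreateConnection`, `IConnection.CreateModel`); newer client versions expose an async API instead. The class name `RabbitMqConnectivityHealthCheck` is made up for this example.

    ```csharp
    using System;
    using System.Threading;
    using System.Threading.Tasks;
    using Microsoft.Extensions.Diagnostics.HealthChecks;
    using RabbitMQ.Client;

    // Hypothetical health check: healthy if a connection and a channel can be opened.
    // Assumes the synchronous RabbitMQ.Client 5.x/6.x API.
    public sealed class RabbitMqConnectivityHealthCheck : IHealthCheck
    {
        private readonly IConnectionFactory _factory;

        public RabbitMqConnectivityHealthCheck(IConnectionFactory factory) => _factory = factory;

        public Task<HealthCheckResult> CheckHealthAsync(
            HealthCheckContext context,
            CancellationToken cancellationToken = default(CancellationToken))
        {
            try
            {
                // Open a short-lived connection and channel, then dispose both.
                using (var connection = _factory.CreateConnection())
                using (connection.CreateModel())
                {
                    return Task.FromResult(connection.IsOpen
                        ? HealthCheckResult.Healthy("Broker connection opened")
                        : HealthCheckResult.Unhealthy("Broker connection is not open"));
                }
            }
            catch (Exception ex)
            {
                // Any failure to connect (unreachable host, bad credentials, ...) fails the check.
                return Task.FromResult(HealthCheckResult.Unhealthy("Cannot connect to broker", ex));
            }
        }
    }
    ```

    It can be registered with something like `services.AddHealthChecks().AddCheck("rabbitmq", new RabbitMqConnectivityHealthCheck(factory))`. Opening a new connection on every probe is simple but not cheap, so keep the probe interval reasonable.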

    There are, however, a couple of things that could be done with MassTransit as well. It is possible to connect an IBusObserver instance to the bus control, and also to use an instance of IReceiveEndpointObserver for each receive endpoint. If the bus fails to start or an endpoint gets a connection error, you can fail the health check. Whilst writing this, I got some ideas and might implement some of those checks next week :)
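
    One way to sketch the observer idea is a small state holder that the observers report into and that the health check reads. The class below is hypothetical and deliberately does not reference MassTransit types, because the exact observer signatures differ between MassTransit versions. The assumption is that your IBusObserver implementation calls `ReportStarted` once the bus has started and `ReportFaulted` when start fails, and that your IReceiveEndpointObserver implementation calls `ReportFaulted` when an endpoint faults; check the callback names against the version you use.

    ```csharp
    using System;
    using System.Threading;
    using System.Threading.Tasks;
    using Microsoft.Extensions.Diagnostics.HealthChecks;

    // Hypothetical state holder: bus and endpoint observers report into it,
    // and the health check endpoint reads the last reported state.
    public sealed class BusHealthState : IHealthCheck
    {
        private volatile Exception _lastFault;
        private volatile bool _started;

        // Call from the observer callback that signals a successful start
        // (or an endpoint becoming ready again after a fault).
        public void ReportStarted()
        {
            _lastFault = null;
            _started = true;
        }

        // Call from callbacks that signal a start failure or an endpoint fault.
        public void ReportFaulted(Exception exception)
        {
            _lastFault = exception;
            _started = false;
        }

        // Call from the callback that signals the bus has stopped.
        public void ReportStopped() => _started = false;

        public Task<HealthCheckResult> CheckHealthAsync(
            HealthCheckContext context,
            CancellationToken cancellationToken = default(CancellationToken))
        {
            var fault = _lastFault;
            if (fault != null)
                return Task.FromResult(HealthCheckResult.Unhealthy("Bus reported a fault", fault));

            return Task.FromResult(_started
                ? HealthCheckResult.Healthy("Bus is started")
                : HealthCheckResult.Unhealthy("Bus is not started"));
        }
    }
    ```

    Registering one instance as a singleton lets the observers and the health check share the same state. A single flag is coarse; with several receive endpoints you would probably track state per endpoint instead.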