HealthChecks failing after several hours

I've implemented health checks in one of our web apps:

        services.AddHealthChecks()
            .AddSqlServer(connectionString.ConnectionString, null, HealthCheckName); // Sql HealthCheck

What I've noticed is that we get the following error at least once a day:

An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full. (directaccess-.....****) An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full.

The app then restarts and works again. Has anyone run into this issue before?

Solution

Option -- 1

Application Insights appears to be integrated with this app, so review the Application Insights data to identify why the application code threw exceptions or why the app took a long time to load.

Please follow these instructions to view Application Insights data.

Go to the Application Insights blade for this app.
Click View Application Insights Data.

Option -- 2

If the issue is happening right now, collect a .NET Profiler trace to troubleshoot it. A profiler trace lets you identify the exception type, message, and call stack of a .NET exception without installing any additional tools and without changing the state of the problem. It works for both ASP.NET and ASP.NET Core applications.

Please follow these instructions to collect a profiler trace.

Go to App Service Diagnostics.
Choose Diagnostic Tools.
Click the Collect .NET Profiler Trace tile and follow the instructions.

(The Collect .NET Profiler Trace tile is enabled only for ASP.NET and ASP.NET Core applications. If your app is an ASP.NET app and you don't see this tile, choose the application stack from the top right.)

Option -- 3

If the issue is intermittent or you cannot reproduce it on demand, you can configure an Auto Healing custom action to collect data (such as a profiler trace or memory dump) that will help you debug the issue further. Auto Healing triggers let you define conditions based on request count, slow requests, or a memory limit, and its actions let you respond by restarting the process, logging an event, or starting another executable.

Please follow these instructions to configure an Auto Healing rule.

Go to App Service Diagnostics.
Choose Diagnostic Tools.
Click the Auto Healing tile under the Proactive Tools category.
Configure a rule based on your scenario.
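
For reference, a rule configured this way ends up in the app's site configuration roughly as sketched below. This is only an illustration of the shape of the settings, assuming a slow-request trigger and a custom action. The thresholds are made-up values, and the "exe" and "parameters" entries are placeholders, not real paths, so replace them with whatever the portal generates for your scenario.

```json
{
  "autoHealEnabled": true,
  "autoHealRules": {
    "triggers": {
      "slowRequests": {
        "timeTaken": "00:00:30",
        "count": 10,
        "timeInterval": "00:05:00"
      }
    },
    "actions": {
      "actionType": "CustomAction",
      "customAction": {
        "exe": "<path-to-your-diagnostic-executable>",
        "parameters": "<arguments-for-that-executable>"
      },
      "minProcessExecutionTime": "00:10:00"
    }
  }
}
```

With these example values, the custom action would run once 10 requests have each taken longer than 30 seconds within a 5-minute window, but not before the process has been up for 10 minutes. If you only want the process restarted or an event logged, "actionType" can be "Recycle" or "LogEvent" instead, in which case the "customAction" entry is not needed.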