c#, redis, stackexchange.redis, azure-redis-cache

It was not possible to connect to the redis server(s); ConnectTimeout


I'm using an Azure Function V1 with StackExchange.Redis 1.2.6. The function receives thousands of messages per minute, and for every message, for every device, I check Redis. I noticed that when the message volume is high, we get the error below.

Exception while executing function: TSFEventRoutingFunction No connection is available to service this operation: HGET GEO_DYNAMIC_hash; It was not possible to connect to the redis server(s); ConnectTimeout; IOCP: (Busy=1,Free=999,Min=24,Max=1000), WORKER: (Busy=47,Free=32720,Min=24,Max=32767), Local-CPU: n/a It was not possible to connect to the redis server(s); ConnectTimeout

CacheService as recommended by MS

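using System;
using System.Configuration;
using System.Threading.Tasks;
using StackExchange.Redis;

public class CacheService : ICacheService
{
    private readonly IDatabase cache;
    private static readonly string connectionString = ConfigurationManager.AppSettings["RedisConnection"];

    public CacheService()
    {
        // GetDatabase() is cheap; the underlying multiplexer is shared.
        this.cache = Connection.GetDatabase();
    }

    // One ConnectionMultiplexer for the whole process, created on first use.
    private static readonly Lazy<ConnectionMultiplexer> lazyConnection = new Lazy<ConnectionMultiplexer>(() =>
    {
        return ConnectionMultiplexer.Connect(connectionString);
    });

    public static ConnectionMultiplexer Connection
    {
        get
        {
            return lazyConnection.Value;
        }
    }

    // Reads a single field (ruleKey) from the hash stored at hashKey.
    public async Task<string> GetAsync(string hashKey, string ruleKey)
    {
        return await this.cache.HashGetAsync(hashKey, ruleKey);
    }
}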

I'm injecting ICacheService into the Azure Function and calling the GetAsync method on every request.

Using Azure Redis Instance C3


As you can see, I currently have a single connection. Would creating multiple connections help solve this issue? Or is there any other suggestion for solving or understanding it?


Solution

  • There are many different causes of the error you are getting. Here are some I can think of off the top of my head (not in any particular order):

    1. Your connectTimeout is too small. I often see customers set a small connect timeout because they think it will ensure that the connection is established within that time span. The problem with this approach is that when something goes wrong (high client CPU, high server CPU, etc.), the connection attempt will fail. This often makes a bad situation worse - instead of helping, it aggravates the problem by forcing the system to restart the process of trying to reconnect, often resulting in a connect -> fail -> retry loop. I generally recommend that you leave your connectTimeout at 15 seconds or higher. It is better to let your connection attempt succeed after 15 or 20 seconds than it is to have it fail after 5 seconds repeatedly, resulting in an outage lasting several minutes until the system finally recovers.

    2. A server-side failover occurs. A connection is severed by the server as a result of some type of failover from master to replica. This can happen if the server-side software is updated at the Redis layer, the OS layer or the hosting layer.

    3. A networking infrastructure failure of some type (hardware sitting between the client and the server sees some type of issue).

    4. You change the access password for your Redis instance. Changing the password will reset connections to all clients to force them to re-authenticate.

    5. Thread Pool Settings need to be adjusted. If your thread pool settings are not adjusted correctly for your workload, then you can run into delays in spinning up new threads as explained here.
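For point 5, here is a minimal sketch of raising the thread pool minimums at startup. The error in the question reports Min=24 for both the IOCP and WORKER pools, with 47 worker threads busy, so the pool may have been growing on demand at that moment. `ThreadPoolTuning` and `EnsureMinThreads` are hypothetical names for illustration, and the value 200 in the usage note is an assumption, not a recommendation; the right number depends on your workload.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not part of StackExchange.Redis).
public static class ThreadPoolTuning
{
    // Raises the minimum worker and IOCP thread counts to at least the
    // requested values. Existing minimums are never lowered.
    // Returns false if the runtime rejected the new values.
    public static bool EnsureMinThreads(int minWorker, int minIocp)
    {
        int currentWorker;
        int currentIocp;
        ThreadPool.GetMinThreads(out currentWorker, out currentIocp);

        return ThreadPool.SetMinThreads(
            Math.Max(currentWorker, minWorker),
            Math.Max(currentIocp, minIocp));
    }
}
```

Calling it once, for example from a static constructor before the first Redis call, could look like `ThreadPoolTuning.EnsureMinThreads(200, 200);`. Setting the minimums too high has its own cost, so it is worth measuring before and after.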

    I have written a bunch of best practices for Redis that will help you avoid other problems as well.
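For point 1, a minimal sketch of enforcing a connect timeout of at least 15 seconds when building the multiplexer. `RedisConnectionFactory`, `BuildOptions` and `MinConnectTimeoutMs` are hypothetical names; `ConfigurationOptions.Parse`, the `ConnectTimeout` property (in milliseconds) and `ConnectionMultiplexer.Connect` come from StackExchange.Redis.

```csharp
using System;
using StackExchange.Redis;

// Hypothetical factory, shown only to illustrate the connectTimeout advice.
public static class RedisConnectionFactory
{
    // 15 seconds, the lower bound suggested in the answer above.
    private const int MinConnectTimeoutMs = 15000;

    // Parses the connection string and raises ConnectTimeout if it is
    // below the minimum. A larger configured value is left untouched.
    public static ConfigurationOptions BuildOptions(string connectionString)
    {
        ConfigurationOptions options = ConfigurationOptions.Parse(connectionString);
        options.ConnectTimeout = Math.Max(options.ConnectTimeout, MinConnectTimeoutMs);
        return options;
    }

    public static ConnectionMultiplexer Connect(string connectionString)
    {
        return ConnectionMultiplexer.Connect(BuildOptions(connectionString));
    }
}
```

In the `CacheService` from the question, the `Lazy<ConnectionMultiplexer>` factory could call `RedisConnectionFactory.Connect(connectionString)` instead of `ConnectionMultiplexer.Connect(connectionString)`. The same setting can also be written directly in the connection string as `connectTimeout=15000`.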