performance network-programming redis centos booksleeve

Redis 2.4 / CentOS 6.2 network throughput dips every 4 minutes... Redis... or client related?


Yesterday we encountered some strange performance dips on our Redis 2.4 / CentOS 6.2 cache servers. They recur every 4 minutes.

Here's a screenshot from New Relic of the master server: https://www.evernote.com/shard/s368/sh/28312f97-60a9-45ab-a27e-b31abb5c7cce/8fb69edd1206c228fcc444330f1909ec

And here's one from one of the slaves during the same period: https://www.evernote.com/shard/s368/sh/802b01bc-294d-46a5-adaa-f64e2e8c8bd2/6cbe244d4570fae63ee412cd1de5a841

Some information about our environment:

- Cache: 4 Linux cloud servers with 8 CPUs, 30GB RAM and 600Mbps internal network bandwidth
- Web: 30 Windows cloud servers with 4 CPUs and 200Mbps internal network bandwidth

The web servers don't seem to be very busy, but they do time out when the dips occur. We can't rule out a client issue, so here's some more info on the web application:

Microsoft ASP.NET MVC 3 web application with the Redis BookSleeve client 1.1.0.4 for the data cache and AngiesList v???? (one that's compatible with this version of BookSleeve) for session state.

At first we had some problems with the number of connections to Redis. As I understand it, Redis 2.4 has a fixed limit on the number of connected clients.

That's why we've split the session cache and the data cache into separate Redis instances. Unfortunately AngiesList doesn't support more than one connection, so it is connected only to the master server. The BookSleeve client connections are randomized using System.Random from .NET:

readonly Random _randomReadConnection = new Random((int)DateTime.Now.Ticks);
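One thing that may be worth checking on the client side: `System.Random` is documented as not thread-safe, and in a web application a single shared instance is likely to be called from many request threads at once. The sketch below is a minimal illustration of guarding the random pick with a lock. The class name `RandomReadSelector` and its members are hypothetical and are not part of BookSleeve or of the application described here, and whether this has anything to do with the dips is not known.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a BookSleeve type): picks one item at random
// from a fixed list of read connections, safely from multiple threads.
public sealed class RandomReadSelector<T>
{
    private readonly IList<T> _connections;
    private readonly Random _random = new Random();
    private readonly object _sync = new object();

    public RandomReadSelector(IList<T> connections)
    {
        if (connections == null)
            throw new ArgumentNullException("connections");
        if (connections.Count == 0)
            throw new ArgumentException("At least one connection is required.", "connections");
        _connections = connections;
    }

    public T Next()
    {
        // System.Random instance members are not thread-safe,
        // so serialize access to the shared instance.
        lock (_sync)
        {
            return _connections[_random.Next(_connections.Count)];
        }
    }
}
```

A single selector would be created once at startup with the list of open connections, and `Next()` called wherever a read connection is needed.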

The number of client connections for the data cache is around 200 on all servers. The session cache had about 4100 connections at its peak.

We have looked closely at the Redis logs and monitor output and run iftop and top, but can't find anything useful.

So... why are these dips occurring?

I am a .NET developer, not a Linux specialist. We don't have any specialists in the Redis/Linux field, so we're hoping somebody here can help us narrow down the search.

As part of a backup plan we are now updating our client to ServiceStack Redis v3 with a compatible session state package and configuring a server with Redis 2.8 just to be sure.

Thanks.


Solution

  • Issues appeared under high loads. The problems disappeared when upgrading our servers to Redis 2.8.

    We also ran into memory issues with Redis 2.4 when we upgraded to ServiceStack v3 (link).

    Turning AOF and RDB snapshots off brought performance to a sustainable level (link).
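
For reference, disabling both persistence mechanisms in redis.conf generally looks like the fragment below. This is a sketch rather than the configuration actually used here, and it assumes a recent enough Redis (such as the 2.8 servers mentioned above). On older versions the empty `save` directive may not be accepted, in which case commenting out every `save` line has the same effect. With both turned off, the cache contents are lost on restart.

```
# redis.conf
# Disable the append-only file (AOF)
appendonly no

# Disable RDB snapshots by clearing all save points
save ""
```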