We run a PHP stack on our app servers. Each app server runs twemproxy locally, and PHP talks to it over a Unix socket; twemproxy in turn connects to multiple upstream memcached servers (EC2 small instances) that make up our caching layer.
Every so often I get an alert from our app monitor that a page load took more than 5 seconds. When this occurs, the immediate fix is to restart the twemproxy service on each app server, which is a hassle.
The only workaround I have now is a cron job that runs every minute and restarts the service, but as you can imagine, nothing gets written for a few seconds every minute, so this is not an acceptable permanent solution.
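One way to soften the every-minute restart would be to restart only when the proxy actually stops answering. Below is a minimal Python sketch under stated assumptions, not a tested fix: it sends a memcached text-protocol `get` over the socket path from the config and treats any complete reply line that arrives within a timeout as a sign of life. The key name `__nutcracker_probe__`, the 1-second timeout, and the `service nutcracker restart` command are placeholders I made up for illustration; adjust them to your setup.

```python
import socket
import subprocess

SOCKET_PATH = "/var/run/nutcracker/nutcracker.sock"  # from the listen: line in the config
PROBE_KEY = b"__nutcracker_probe__"  # hypothetical key, expected to be a cache miss
TIMEOUT_S = 1.0  # assumption: a healthy local proxy answers well within this
RESTART_CMD = ["service", "nutcracker", "restart"]  # assumption: depends on your init system


def proxy_answered(data: bytes) -> bool:
    """Any complete reply line (END, VALUE ..., SERVER_ERROR ...) ends in CRLF."""
    return data.endswith(b"\r\n")


def probe(path: str = SOCKET_PATH, timeout: float = TIMEOUT_S) -> bool:
    """Return True if the proxy replies to a `get` over the Unix socket in time."""
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            s.connect(path)
            s.sendall(b"get " + PROBE_KEY + b"\r\n")
            data = b""
            while not proxy_answered(data):
                chunk = s.recv(4096)
                if not chunk:  # proxy closed the connection without replying
                    return False
                data += chunk
            return True
    except OSError:  # covers timeouts, missing socket file, refused connections
        return False


def main() -> int:
    """Restart the proxy only if the probe failed; return a process exit code."""
    if probe():
        return 0
    return subprocess.call(RESTART_CMD)
```

To use it from cron, the script would end with a call such as `raise SystemExit(main())`. An error reply from the proxy (for example when an upstream is ejected) still counts as "alive" here, because the symptom I am chasing is the proxy itself hanging, not a backend being down.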
Has anyone encountered this before? If so, what was the fix? I tried switching to AWS ElastiCache, but it did not perform as well as our current twemproxy setup.
Here is my twemproxy config.
auto_eject_hosts: true
distribution: ketama
hash: fnv1a_64
listen: /var/run/nutcracker/nutcracker.sock 0666
server_failure_limit: 1
server_retry_timeout: 600000 # 600sec, 10m
timeout: 100
- vcache-1:11211:1
- vcache-2:11211:1
And here is the connection config for the php layer:
# Note: We are using HA / twemproxy (nutcracker) / memcached proxy
# So this isn't a default memcache(d) port
# Each webapp will host the cache proxy, which allows us to connect via socket
# which should be faster, as there is no TCP overhead
# Hash has been manually overridden from the default (Jenkins) to FNV1A_64, which matches the hash configured on the proxy
port: 0
<?php echo Hobis_Api_Cache::TYPE_VOLATILE; ?>:
- <?php echo Memcached::OPT_HASH; ?>: <?php echo Memcached::HASH_FNV1A_64; ?><?php echo PHP_EOL; ?>
- <?php echo Memcached::OPT_SERIALIZER; ?>: <?php echo Memcached::SERIALIZER_IGBINARY; ?><?php echo PHP_EOL; ?>
- /var/run/nutcracker/nutcracker.sock
We are running twemproxy 0.4.1 and memcached 1.4.25.
I ended up switching from the Unix socket to a TCP port on localhost, and that seems to have resolved the problem that required the restarts. However, I did notice an uptick in response time after making the switch, which I attribute to the overhead of TCP. I am not accepting this answer, in the hope that someone down the road will post a more authoritative answer about the sockets.
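To put a number on that uptick rather than eyeballing it, something like the following Python sketch could measure the median round-trip time of a memcached `get` over either transport. It is an illustration under assumptions, not a benchmark I have run: the key `__latency_probe__`, the sample count, and the 1-second timeout are arbitrary choices, and it measures only the hop from the app server to the proxy on a single connection.

```python
import socket
import statistics
import time


def connect(target, timeout=1.0):
    """Open a stream socket to a Unix socket path (str) or a (host, port) tuple."""
    family = socket.AF_UNIX if isinstance(target, str) else socket.AF_INET
    s = socket.socket(family, socket.SOCK_STREAM)
    try:
        s.settimeout(timeout)
        s.connect(target)
    except OSError:
        s.close()
        raise
    return s


def median_get_ms(target, samples=200, key=b"__latency_probe__"):
    """Median round-trip time in milliseconds for a `get` on one connection."""
    timings = []
    with connect(target) as s:
        for _ in range(samples):
            start = time.perf_counter()
            s.sendall(b"get " + key + b"\r\n")
            reply = b""
            # Read until a complete reply line has arrived (a miss is END\r\n).
            while not reply.endswith(b"\r\n"):
                chunk = s.recv(4096)
                if not chunk:
                    raise ConnectionError("proxy closed the connection")
                reply += chunk
            timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)
```

Calling `median_get_ms("/var/run/nutcracker/nutcracker.sock")` before the switch and `median_get_ms(("127.0.0.1", 22121))` after it would give two comparable figures; the port here is only an example and should be whatever TCP address the `listen:` line uses.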