I'm running a RefineryCMS application on Heroku. It's on two web dynos, yet actual dyno load is under 10% a piece (if I'm reading New Relic correctly) for the current negligible traffic of 13 requests per minute.
However, my app server average response time is in the 800ms and almost all the time is spent in request queueing. Every other request will suffer a 3000ms (or much more) request queue spike, while some others will load up just fine in under 200-300ms thanks to aggressive memcaching.
How can I find out what the root cause of the issue is?
Edit: After further investigation into the log, I had to ask myself the question: is this expected RefineryCMS with asset hosting on Amazon S3 behaviour?
heroku[router]: method=GET path=/system/images/W1siZiIsIjIwMTMvMDQvMTQvMTUvMTAvMDkvNzY4L0pFQU5fQ0xBVURFX1BFTk5FVElFUi5qcGciXSxbInAiLCJ0aHVtYiIsIjI1MHgiXV0/JEAN-CLAUDE%20PENNETIER.jpg service=1061ms status=200 bytes=11232
heroku[router]: method=GET path=/system/images/W1siZiIsIjIwMTMvMDQvMTQvMTUvMTEvMzkvMjI3L01BUklTU19KQU5TT05TLmpwZyJdLFsicCIsInRodW1iIiwiMjUweCJdXQ/MARISS%20JANSONS.jpg service=1161ms status=200 bytes=14358
heroku[router]: method=GET path=/system/images/W1siZiIsIjIwMTMvMDQvMTQvMTUvMjEvNDAvOC9TaXJfTkVWSUxMRV9NQVJSSU5FUi5qcGciXSxbInAiLCJ0aHVtYiIsIjI1MHgiXV0/Sir%20NEVILLE%20MARRINER.jpg service=5412ms status=200 bytes=23191
Are these files being hosted locally and as such weighing down requests?
This sounds like slow running requests. Take a look in your logs for more info, but I suspect you'll see a bunch of H12 errors.
I'd look at using Unicorn and Rack::Timeout if you aren't already. Typically, fast requests are slowed down by a queue caused by other long running requests. By looking at using Rack::Timeout you can kill off any requests that are running for a long time (say 15s), whilst using Unicorn will allow you to serve more than one request at a time.
This is, of course, assuming that you're running on Cedar. If you're not, I would upgrade ASAP.