ruby-on-rails nginx amazon-ec2 elastic-load-balancer

Why does my Amazon EC2 server go down every day for a few minutes with 502 and 503 errors? (Rails app)

I don't know which error causes the problem. When I check, the server is down with a 503 error. In the Google Chrome console, I see the following error:

503 Service Unavailable: Back-end server is at capacity

While the server is down, I can't connect via SSH to read the error log. After a few minutes the server comes back up and I can check the nginx error log.

In the log, I see common errors such as:

ActiveRecord::RecordNotFound (Couldn't find Attachment with 'id'=4240)

I know how to fix this one, and I don't think it is the cause of the problem.

But I also see this error:

Sending 502 response: application did not send a complete response
Process (pid=31880, group=/home/ubuntu/........./current/public) no longer exists! Detaching it from the pool.

I think this is the problem, but the causes and solutions I found on the internet do not seem to apply to my case.

The problem started after I created a Load Balancer and switched to HTTPS. It never happened before that.

About my server and app:

Amazon EC2 instance;
Using a Classic Load Balancer (with AWS Certificate Manager on the HTTPS port);
Using Route 53;
Not using an Elastic IP;
OS: Ubuntu 14.04.2 LTS
ruby -v: 2.2.2p95 (2015-04-13 revision 50295) [x86_64-linux]
rails -v: Rails 4.2.3
nginx -v: nginx/1.8.0
passenger -v: Phusion Passenger version 5.0.10

Load Balancer Health Check is set up like this:

Ping Target         HTTP:80/index.html
Timeout             5 seconds
Interval            30 seconds
Unhealthy threshold 5
Healthy threshold   5

Health Check Information:

This is a screenshot from the Load Balancer's MONITORING tab, showing the Unhealthy Hosts (Count) metric. Why was my host unhealthy?

Solution

In my case, the problem was the assets precompile task. My app has a lot of assets, and when I deployed with Capistrano, precompiling them exhausted the server.

In addition, the assets were sometimes compiled after the deploy, during page loads. That is very slow, and requests returned 502, 503 and 504 errors.

This also took the server down, because CPU utilization went to 100% and the average latency rose as well.

To solve this, I removed the assets precompile task from Capistrano. I precompile the assets on my local PC and commit all of them to the Git master branch. When I run cap production deploy, the precompile task does not run. More details in this post.
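The answer does not say exactly how the task was removed. One possible way, sketched here under the assumption of Capistrano 3 with the capistrano-rails gem (where deploy tasks are ordinary Rake tasks), is to clear the actions of deploy:assets:precompile in config/deploy.rb. The snippet below defines a stand-in task so that it can run on its own; in a real deploy only the clear_actions line would be needed.

```ruby
require "rake"

# Stand-in for the task that capistrano-rails would normally define.
# (Hypothetical body: the real task runs `rake assets:precompile` on the server.)
ran = []
Rake::Task.define_task("deploy:assets:precompile") { ran << :precompile }

# The line that would go in config/deploy.rb (assumption: Capistrano 3).
# It keeps the task defined but removes everything it does.
Rake::Task["deploy:assets:precompile"].clear_actions

# Invoking the task is now a no-op, so the server never compiles assets.
Rake::Task["deploy:assets:precompile"].invoke
puts ran.empty? ? "precompile skipped" : "precompile ran"
```

Another option under the same assumption is to leave capistrano/rails/assets out of the Capfile so the task is never loaded.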

I also changed my Load Balancer Health Check settings:

Ping Target HTTP:80/elb/index.html (I created this folder and file inside the public folder)
Timeout 5 seconds
Interval    30 seconds
Unhealthy threshold 2
Healthy threshold   10

Idle timeout: 65 seconds (equal to my nginx timeout)

With this, I hope the assets precompile task never runs on the server again.
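To see what the threshold change means in practice, here is a rough calculation comparing the original and the revised health check settings. It assumes the load balancer changes a host's state after the given number of consecutive checks, one per interval, and ignores the 5 second timeout, so the figures are approximations. The method name seconds_to_flip is made up for this example.

```ruby
# Approximate seconds before the load balancer changes a host's state:
# consecutive checks required * seconds between checks.
def seconds_to_flip(interval:, threshold:)
  interval * threshold
end

old_settings = { interval: 30, unhealthy: 5, healthy: 5 }
new_settings = { interval: 30, unhealthy: 2, healthy: 10 }

{ "old" => old_settings, "new" => new_settings }.each do |label, s|
  down = seconds_to_flip(interval: s[:interval], threshold: s[:unhealthy])
  up   = seconds_to_flip(interval: s[:interval], threshold: s[:healthy])
  puts "#{label}: marked unhealthy after ~#{down}s, healthy again after ~#{up}s"
end
# old: marked unhealthy after ~150s, healthy again after ~150s
# new: marked unhealthy after ~60s, healthy again after ~300s
```

So the new settings take a struggling host out of rotation sooner, and wait longer before sending traffic back to it.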