network-programming google-cloud-platform google-compute-engine load-balancing google-cloud-http-load-balancer

GCP load balancer instance becomes unhealthy after a short period of time


I've put my Linux Apache web server, running on GCP, behind the Google load balancer. Because I only want HTTPS traffic, I've redirected port 80 to 443 as shown below:

<VirtualHost *:80>
  ServerName  spawnparty.com
  ServerAlias www.spawnparty.com
  DocumentRoot /var/www/html/wwwroot
  # Trailing slash matters: the rest of the request path is appended to the target.
  Redirect permanent / https://www.spawnparty.com/
</VirtualHost>

I've given the VM an external IP address to test whether the redirect works, and it does.

I've then configured the load balancer so that the frontend accepts both HTTP and HTTPS. For the backend I made two services:

one that uses HTTP and one that uses HTTPS, so that if someone comes in through HTTP the request is forwarded and then redirected to HTTPS by the configuration shown above.

For both backend services I made a basic health check:

For HTTP: port: 80, timeout: 5s, check interval: 5s, unhealthy threshold: 2 attempts

For HTTPS: port: 443, timeout: 5s, check interval: 5s, unhealthy threshold: 2 attempts

The HTTPS one works fine and reports 1 of 1 instance healthy, but the HTTP health check reports 0 of 1 instance healthy.

If I change the health check for the HTTP backend service from HTTP to HTTPS and back again, it works for a short period of time, but after a few minutes it reports 0 of 1 instance healthy again.

What must I change to keep it healthy?


Solution

  • TL;DR - Use the same HTTPS health check for both the backend services.

    Health Checking and response codes

    You will need to respond with a 200 response code and close the connection normally within the configured period.

    HTTP and HTTPS health checks

    If traffic from the load balancer to your instances uses the HTTP or HTTPS protocol, then HTTP or HTTPS health checks verify that the instance is healthy and the web server is up and serving traffic.

    For an HTTP(S) health check probe to be deemed successful, the instance must return a valid HTTP response with code 200 and close the connection normally within the configured period. If it does this a specified number of times in a row, the health check returns a status of HEALTHY for that instance. If an instance fails a specified number of health check probes in a row, it is marked UNHEALTHY without any notification being sent. UNHEALTHY instances do not receive new connections, but existing connections are allowed to continue. UNHEALTHY instances continue to receive health check probes. If an instance later passes a health check by successfully responding to a specified number of consecutive health check probes, it is marked HEALTHY and starts receiving new connections, again without any notification.
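    The consecutive-probe rule described above can be sketched as a small state machine. This is only an illustration of the quoted behaviour, not Google's implementation: the class name `HealthTracker`, its fields, and the choice to start an instance as UNHEALTHY are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class HealthTracker:
    """Tracks one instance's state from a stream of probe results (hypothetical helper)."""
    healthy_threshold: int = 2    # consecutive successes needed to become HEALTHY
    unhealthy_threshold: int = 2  # consecutive failures needed to become UNHEALTHY
    healthy: bool = False         # assumption: start UNHEALTHY until proven otherwise
    _streak: int = 0              # current run of results that contradict the state

    def record(self, probe_ok: bool) -> bool:
        """Feed one probe result and return the resulting state."""
        if probe_ok == self.healthy:
            # The result agrees with the current state, so any opposing streak is broken.
            self._streak = 0
            return self.healthy
        self._streak += 1
        needed = self.healthy_threshold if probe_ok else self.unhealthy_threshold
        if self._streak >= needed:
            # Enough consecutive contradicting probes: flip the state.
            self.healthy = probe_ok
            self._streak = 0
        return self.healthy


# An instance that never returns 200 fails every probe.
tracker = HealthTracker(healthy=True)
states = [tracker.record(False) for _ in range(3)]
print(states)  # [True, False, False]
```

    With the question's settings (check interval 5s, unhealthy threshold 2), a backend that fails every probe flips to UNHEALTHY on the second consecutive failure in this model.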

    Since you have two separate backend services (one for HTTP and the other for HTTPS), the load balancer considers them independent services, and each one needs a health check attached. Backend services can share the same health check resource if needed (read on).

    As you have already confirmed, the HTTPS health check works with the HTTPS-based service, but the HTTP health check does not. The reason is that your port 80 virtual host returns an HTTP 301 response code (permanent redirect) instead of the expected HTTP 200 response code.
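    To see why the redirect fails the check, here is a minimal sketch of a probe that applies the "200 only" rule quoted earlier. The function names `counts_as_healthy` and `probe` are made up for this example, and the real health checker's request details are not reproduced here; the point is only that a client which does not follow redirects sees the 301 itself.

```python
import http.client


def counts_as_healthy(status_code: int) -> bool:
    """Per the rule quoted above, only a plain 200 passes; a 301 redirect does not."""
    return status_code == 200


def probe(host: str, port: int = 80, path: str = "/", timeout: float = 5.0) -> bool:
    """Send one GET without following redirects and classify the response."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return counts_as_healthy(conn.getresponse().status)
    except (OSError, http.client.HTTPException):
        # Timeouts, refused connections and malformed responses all fail the probe.
        return False
    finally:
        conn.close()


print(counts_as_healthy(301))  # False
print(counts_as_healthy(200))  # True
```

    Calling `probe()` with the VM's external IP on port 80 should return `False` as long as the virtual host answers every request with the permanent redirect.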

    Possible Solution(s)

    One way to fix this is to use HTTPS health checks for both backend services, since the underlying service is the same. You lose the ability to health check the redirection, but that is unfortunately not supported by the Google Cloud Load Balancer. You can also share a single HTTPS health check resource between both backend services.

    The solution posted by CharlesB will also work, but I feel it adds redirection rules that exist only to satisfy health checks and are not used on your service path anyway. You would also need a separate HTTP health check resource. Using just HTTPS health checks for both backend services is, I feel, much simpler, and it still verifies that your service is alive to handle new requests.