Health-check of a redis job flagged as critical in Nomad


When I deploy a Redis job in Nomad (0.6), I cannot get it to show as healthy in Consul.

I start Consul in a container and publish port 8500 on localhost:


$ docker container run --name consul -d -p 8500:8500 consul

When I run Nomad, it connects to Consul correctly, as the logs show:


$ nomad agent -dev
    No configuration files loaded
==> Starting Nomad agent...
==> Nomad agent configuration:

                Client: true
             Log Level: DEBUG
                Region: global (DC: dc1)
                Server: true
               Version: 0.6.0

==> Nomad agent started! Log data will stream in below:
...
    2017/08/18 15:45:28.373766 [DEBUG] client.consul: bootstrap contacting following Consul DCs: ["dc1"]
    2017/08/18 15:45:28.377703 [INFO] client.consul: discovered following Servers: 127.0.0.1:4647
    2017/08/18 15:45:28.378851 [INFO] client: node registration complete
    2017/08/18 15:45:28.378895 [DEBUG] client: periodically checking for node changes at duration 5s
    2017/08/18 15:45:28.379232 [DEBUG] consul.sync: registered 1 services, 1 checks; deregistered 0 services, 0 checks
...

I then run a Redis job with the following configuration file:


job "nomad-redis" {
  datacenters = ["dc1"]
  type = "service"

  group "cache" {

    task "redis" {
      driver = "docker"

      config {
        image = "redis:3.2"
        port_map {
          db = 6379
        }
      }

      resources {
        cpu    = 500 # 500 MHz
        memory = 256 # 256MB
        network {
          mbits = 10
          port "db" {}
        }
      }

      service {
        name = "redis"
        port = "db"
        check {
          name     = "alive"
          type     = "tcp"
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}

The Redis service is registered in Consul, but it appears as critical. It seems the health check cannot be performed. From what I understand, checks are done within the task. Is there something I'm missing?


Solution

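  • Running Consul directly on the host, or in a container attached to the host network (--net=host instead of -p 8500:8500), fixed the problem. The Consul agent itself performs the health check, so it must be able to reach the address and port that Nomad registers for the task.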
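
To see why the network mode matters, it can help to reproduce what a TCP check amounts to: a plain connection attempt with a timeout, made from wherever the checker runs. The sketch below is only an illustration, not Consul's actual implementation; the function name `tcp_check` is hypothetical, and the 2-second default simply mirrors the `timeout = "2s"` in the job file. Running it on the host and again inside the Consul container, against the address and port shown for the redis service in Consul, should show whether the container can reach the task at all.

```python
import socket


def tcp_check(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for illustration only: it mimics the idea of a
    TCP health check (connect, then close), not Consul's real code.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

If the service is registered with a loopback address (which is likely with `nomad agent -dev`, an assumption worth confirming in the Consul UI), a checker running inside a bridge-networked container would be connecting to its own loopback interface rather than the host's, which would explain the critical status.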