I'm facing some trouble trying to run mesos-dns dockerized on a mesos cluster.
I've set up two virtual machines running Ubuntu Trusty on a Windows 8.1 host. The VMs are called docker-vm and docker-sl-vm; the first runs mesos-master and the second runs mesos-slave.
Each VM has two network adapters: one using NAT to access the internet through the host, and a host-only adapter for internal communication.
The IPs for the VMs are:
The Mesos cluster is running fine.
I am following this tutorial, so I am running mesos-dns with the following Marathon app definition:
{
  "args": [
    "/mesos-dns",
    "-config=/config.json"
  ],
  "container": {
    "docker": {
      "image": "mesosphere/mesos-dns",
      "network": "HOST"
    },
    "type": "DOCKER",
    "volumes": [
      {
        "containerPath": "/config.json",
        "hostPath": "/usr/local/mesos-dns/config.json",
        "mode": "RO"
      }
    ]
  },
  "cpus": 0.5,
  "mem": 256,
  "id": "mesos-dns",
  "instances": 1,
  "constraints": [["hostname", "CLUSTER", "docker-sl-vm"]]
}
and this config.json:
{
  "zk": "zk://192.168.56.101:2181/mesos",
  "refreshSeconds": 60,
  "ttl": 60,
  "domain": "mesos",
  "port": 53,
  "resolvers": ["8.8.8.8"],
  "timeout": 5,
  "email": "root.mesos-dns.mesos"
}
I am also running a test application called peek with the following definition:
{
  "id": "peek",
  "cmd": "env >env.txt && python3 -m http.server 8080",
  "cpus": 0.5,
  "mem": 32.0,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "python:3",
      "network": "BRIDGE",
      "portMappings": [
        { "containerPort": 8080, "hostPort": 0 }
      ]
    }
  }
}
PROBLEM
In the tutorial, a dig command such as dig _peek._tcp.marathon.mesos SRV returns the following answer:
; <<>> DiG 9.9.5-3ubuntu0.5-Ubuntu <<>> _peek._tcp.marathon.mesos SRV
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 57329
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; QUESTION SECTION:
;_peek._tcp.marathon.mesos. IN SRV
;; ANSWER SECTION:
_peek._tcp.marathon.mesos. 60 IN SRV 0 0 31000 peek-27346-s0.marathon.mesos.
;; ADDITIONAL SECTION:
peek-27346-s0.marathon.mesos. 60 IN A 10.141.141.10
;; Query time: 4 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Sat Oct 24 23:21:15 UTC 2015
;; MSG SIZE rcvd: 160
There we can clearly see the port and IP bound to _peek._tcp.marathon.mesos. But when I run the same command on my slave machine, which is running this container, I get this result:
docker@docker-sl-vm:~$ dig _peek._tcp.marathon.mesos SRV
; <<>> DiG 9.9.5-3ubuntu0.5-Ubuntu <<>> _peek._tcp.marathon.mesos SRV
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 33415
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1280
;; QUESTION SECTION:
;_peek._tcp.marathon.mesos. IN SRV
;; AUTHORITY SECTION:
. 10791 IN SOA a.root-servers.net. nstld.verisign-grs.com. 2015102801 1800 900 604800 241
;; Query time: 1 msec
;; SERVER: 10.10.11.1#53(10.10.11.1)
;; WHEN: Wed Oct 28 17:06:30 BRT 2015
;; MSG SIZE rcvd: 129
It looks like mesos-dns can't resolve _peek._tcp.marathon.mesos SRV.
Does anyone know why and how to fix it?
Thank you in advance...
UPDATE
Contents of /etc/resolv.conf:
nameserver 10.10.11.1
nameserver 10.10.10.7
Have a look at the Mesos DNS docs regarding Slave Setup:
To allow Mesos tasks to use Mesos-DNS as the primary DNS server, you must edit the file /etc/resolv.conf in every slave and add a new nameserver. For instance, if mesos-dns runs on the server with IP address 10.181.64.13, you should add the line nameserver 10.181.64.13 at the beginning of /etc/resolv.conf on every slave node.
I think the local IP address (192.168.56.102) is missing from your /etc/resolv.conf.
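As a rough sketch of what the edited file should look like, the Python snippet below builds the new contents from the ones you posted. It assumes 192.168.56.102 is the host-only address of the slave running mesos-dns (my guess from your setup), and prepend_nameserver is just an illustrative helper name, not part of Mesos-DNS. It only transforms text; writing the result back to /etc/resolv.conf needs root, and the file may be regenerated by the system's network configuration, so a manual edit might not persist.

```python
def prepend_nameserver(resolv_conf_text: str, dns_ip: str) -> str:
    """Return resolv.conf text with 'nameserver <dns_ip>' as the first line."""
    wanted = f"nameserver {dns_ip}"
    # Drop any existing entry for the same IP so it is not listed twice.
    kept = [
        line
        for line in resolv_conf_text.splitlines()
        if line.split() != ["nameserver", dns_ip]
    ]
    return "\n".join([wanted] + kept) + "\n"


if __name__ == "__main__":
    # Contents posted in the question's update.
    current = "nameserver 10.10.11.1\nnameserver 10.10.10.7\n"
    # 192.168.56.102 is an assumption: the slave's host-only IP.
    print(prepend_nameserver(current, "192.168.56.102"), end="")
```

With the contents from your update, this prints nameserver 192.168.56.102 first, followed by the two existing entries, which is the order the docs ask for.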
Otherwise, you can also try my minimal Mesos DNS image, but you'd still have to edit the above file.
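To check which resolver actually answered a query, look at the SERVER line near the end of the dig output. In your failing run it is 10.10.11.1, the first entry in your /etc/resolv.conf, whereas in the tutorial it is 127.0.0.1. The small Python sketch below pulls that address out of captured dig output; answering_server is a hypothetical helper name, and the sample text is copied from your question.

```python
import re
from typing import Optional


def answering_server(dig_output: str) -> Optional[str]:
    """Return the address from dig's ';; SERVER: addr#port(host)' line, or None."""
    match = re.search(r"^;; SERVER: ([^#\s]+)#\d+", dig_output, re.MULTILINE)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Tail of the dig output from the question.
    sample = (
        ";; Query time: 1 msec\n"
        ";; SERVER: 10.10.11.1#53(10.10.11.1)\n"
        ";; WHEN: Wed Oct 28 17:06:30 BRT 2015\n"
    )
    print(answering_server(sample))
```

If the address reported is not the host running mesos-dns, the query never reached it. You can also point dig at a specific server with the @ syntax, for example dig @192.168.56.102 _peek._tcp.marathon.mesos SRV, again assuming that is the slave's address.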