
Are service addresses available to the DC/OS host OS?


I’m trying to have my DC/OS 1.8 docker containers send log messages to a logstash that is also running in DC/OS, using the service address of the logstash service.

That doesn’t appear to work, as docker throws an error: logstash.marathon.l4lb.thisdcos.directory: no such host

Are service addresses not exposed to the host systems (or do I need to configure something for this)?

On DC/OS 1.7 I used a fixed host port in my logstash config and logstash.marathon.mesos as the host, but these .marathon.mesos hostnames no longer seem to exist in 1.8.

The service addresses work fine when I use them from within a container (for example, to link my prometheus service to my alertmanager service), but at the host level they don’t exist.
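To compare what the host OS and a container can see, it helps to ask the system resolver directly on each side. The following is a minimal sketch, not a DC/OS tool: can_resolve is a helper name made up for this illustration, and it only tests name resolution, not whether anything accepts traffic behind the address.

```python
import socket


def can_resolve(hostname):
    """Return True if the local resolver maps hostname to at least one address."""
    try:
        # getaddrinfo goes through the same resolver configuration
        # (/etc/resolv.conf and friends) that other programs on this machine use.
        return len(socket.getaddrinfo(hostname, None)) > 0
    except socket.gaierror:
        # Raised for "no such host" and similar resolver failures.
        return False
```

Running can_resolve("logstash.marathon.l4lb.thisdcos.directory") and can_resolve("logstash.marathon.mesos") once on the agent host and once inside a container shows which side is missing the name.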

EDIT:

My statement about the missing marathon.mesos hostnames was wrong. They do work, but I used the wrong one. For now this more or less fixes my problem: I configured logging using this host and a fixed container port.

For everybody trying the same thing: you have to re-add the fixed host port in JSON mode every time you change the service config in the UI. The fixed host port setting is no longer available in the network tab of the UI, so the DC/OS UI will DELETE the host port config on every load.

Still no idea why the l4lb addresses don't work.
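For reference, this is roughly where the fixed host port sits in the app definition that the JSON mode edits. It is a sketch under assumptions: the app id, image name, port 12201 and resource values are placeholders, and the field layout is the Docker bridge-mode portMappings form as I understand it for Marathon of that era, so check it against your own version.

```python
import json


def logstash_app(host_port, protocol="tcp"):
    """Build a Marathon-style app definition with a fixed host port.

    All concrete values here (id, image, resources) are illustrative placeholders.
    """
    return {
        "id": "/logstash",
        "cpus": 1,
        "mem": 1024,
        "instances": 1,
        "container": {
            "type": "DOCKER",
            "docker": {
                "image": "logstash",  # placeholder image name
                "network": "BRIDGE",
                "portMappings": [
                    {
                        "containerPort": host_port,
                        # A non-zero hostPort pins the port on the agent;
                        # 0 would let Marathon pick a random one.
                        "hostPort": host_port,
                        "protocol": protocol,
                    }
                ],
            },
        },
    }


app = logstash_app(12201)
app_json = json.dumps(app, indent=2)
```

Keeping the definition in a file or script like this makes it easy to spot when hostPort has been dropped after an edit in the UI.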

EDIT2:

Still no idea, but I figured out that minuteman generates crash and error logs every other second:

/opt/mesosphere/active/minuteman/minuteman/error.log:

CRASH REPORT Process <0.25809.2> with 0 neighbours exited with reason: {timeout,{gen_server,call,[{lashup_kv,'[email protected]'},{start_kv_sync_fsm,'[email protected]',<0.25809.2>}]}} in gen_server:call/2 line 204

/opt/mesosphere/active/minuteman/minuteman/log/crash.log

2016-10-12 13:16:49 =CRASH REPORT====
  crasher:
    initial call: lashup_kv_sync_tx_fsm:init/1
    pid: <0.29002.2>
    registered_name: []
    exception exit: {{timeout,{gen_server,call,[{lashup_kv,'[email protected]'},{start_kv_sync_fsm,'[email protected]',<0.29002.2>}]}},[{gen_server,call,2,[{file,"gen_server.erl"},{line,204}]},{lashup_kv_sync_tx_fsm,init,1,[{file,"/pkg/src/minuteman/_build/default/lib/lashup/src/lashup_kv_sync_tx_fsm.erl"},{line,23}]},{gen_statem,init_it,6,[{file,"gen_statem.erl"},{line,554}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,247}]}]}
    ancestors: [lashup_kv_aae_sup,lashup_kv_sup,lashup_platform_sup,lashup_sup,<0.916.0>]
    messages: []
    links: [<0.992.0>]
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 610
    stack_size: 27
    reductions: 127
  neighbours:

The DC/OS UI claims spartan and minuteman are healthy, but while the crash.log of the DNS dispatcher (spartan) is empty, the l4lb (minuteman) logs new crashes every other second.


Solution

  • My problem was twofold:

    1. The l4lb was not running properly; that was only fixed by a complete reinstall of the cluster.

    2. The l4lb only supports TCP traffic. Because I wanted to use it to send container logs to logstash over UDP (docker-gelf only supports UDP), this failed.
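To make the second point concrete, here is a minimal sketch of what a GELF log message sent over UDP looks like. It is only an illustration of the protocol, not what docker's gelf driver does internally: build_gelf_datagram and send_gelf_udp are names invented for this example, and the field set is the small required core of GELF 1.1 plus a level.

```python
import json
import socket
import time
import zlib


def build_gelf_datagram(short_message, host, level=6):
    """Encode one log record as a zlib-compressed GELF 1.1 JSON payload."""
    record = {
        "version": "1.1",
        "host": host,
        "short_message": short_message,
        "timestamp": time.time(),
        "level": level,  # syslog severity, 6 = informational
    }
    return zlib.compress(json.dumps(record).encode("utf-8"))


def send_gelf_udp(datagram, address):
    """Send the payload as a single UDP datagram to an (ip_or_hostname, port) tuple."""
    # SOCK_DGRAM means UDP: no connection is set up, so a TCP-only
    # load balancer address never forwards this packet.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram, address)
```

Because the datagram is fire-and-forget, the sender gets no error when nothing forwards it, which is why the TCP-only l4lb address could not work for this even once minuteman was healthy. Sending to the logstash.marathon.mesos hostname with a fixed host port, as described in the first edit, bypasses the l4lb.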