vagrant virtualbox mesos dcos

Error while installing DC/OS with "vagrant up m1 a1 boot"


I am very new to DC/OS. I am following the instructions given in https://github.com/dcos/dcos-vagrant/blob/master/docs/deploy.md. During the installation I get an error while running

vagrant up m1 a1 boot

I am using

 Vagrant version 1.8.4
 VirtualBox version 5.0.20
 Windows 7, 8 GB RAM

And the error is:

==> m1:       Sep 28 08:48:37 m1.dcos systemd[1]: Starting Navstar: A distribute
d systems & network overlay orchestration engine...
==> m1:       Sep 28 08:48:37 m1.dcos check-time[12001]: Time is marked as bad
==> m1:       Sep 28 08:48:37 m1.dcos systemd[1]: dcos-navstar.service: control
process exited, code=exited status=1
==> m1:       Sep 28 08:48:37 m1.dcos systemd[1]: Failed to start Navstar: A dis
tributed systems & network overlay orchestration engine.
==> m1:       Sep 28 08:48:37 m1.dcos systemd[1]: Unit dcos-navstar.service ente
red failed state.
==> m1:       Sep 28 08:48:37 m1.dcos systemd[1]: dcos-navstar.service failed.
==> m1:
==> m1:       time="2016-09-28T08:48:37-07:00" level=fatal msg="Found unhealthy
systemd units"
The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!

dcos-postflight

Stdout from the command:



Stderr from the command:

DC/OS Unhealthy\ntime="2016-09-28T08:48:37-07:00" level=info msg="/opt/mesospher
e/etc/endpoints_config.json not found"
time="2016-09-28T08:48:37-07:00" level=info msg="Command /opt/mesosphere/bin/det
ect_ip executed successfully, PID 12011"
[dcos-metronome.service]: DC/OS Metronome unit dcos-metronome.service has never
entered `active` state
-- Logs begin at Wed 2016-09-28 08:22:53 PDT, end at Wed 2016-09-28 08:48:37 PDT
. --
Sep 28 08:46:22 m1.dcos systemd[1]: Unit dcos-metronome.service entered failed s
tate.
Sep 28 08:46:22 m1.dcos systemd[1]: dcos-metronome.service failed.
Sep 28 08:46:37 m1.dcos systemd[1]: dcos-metronome.service holdoff time over, sc
heduling restart.
Sep 28 08:46:37 m1.dcos systemd[1]: Starting Jobs Service: DC/OS Metronome...
Sep 28 08:46:37 m1.dcos systemd[1]: dcos-metronome.service: control process exit
ed, code=exited status=1
Sep 28 08:46:37 m1.dcos systemd[1]: Failed to start Jobs Service: DC/OS Metronom
e.
Sep 28 08:46:37 m1.dcos systemd[1]: Unit dcos-metronome.service entered failed s
tate.
Sep 28 08:46:37 m1.dcos systemd[1]: dcos-metronome.service failed.
Sep 28 08:46:52 m1.dcos systemd[1]: dcos-metronome.service holdoff time over, sc
heduling restart.
Sep 28 08:46:52 m1.dcos systemd[1]: Starting Jobs Service: DC/OS Metronome...
Sep 28 08:46:52 m1.dcos systemd[1]: dcos-metronome.service: control process exit
ed, code=exited status=1
Sep 28 08:46:52 m1.dcos systemd[1]: Failed to start Jobs Service: DC/OS Metronom
e.
Sep 28 08:46:52 m1.dcos systemd[1]: Unit dcos-metronome.service entered failed s
tate.
Sep 28 08:46:52 m1.dcos systemd[1]: dcos-metronome.service failed.
Sep 28 08:47:07 m1.dcos systemd[1]: dcos-metronome.service holdoff time over, sc
heduling restart.
Sep 28 08:47:07 m1.dcos systemd[1]: Starting Jobs Service: DC/OS Metronome...
Sep 28 08:47:07 m1.dcos systemd[1]: dcos-metronome.service: control process exit
ed, code=exited status=1
Sep 28 08:47:07 m1.dcos systemd[1]: Failed to start Jobs Service: DC/OS Metronom
e.
Sep 28 08:47:07 m1.dcos systemd[1]: Unit dcos-metronome.service entered failed s
tate.
Sep 28 08:47:07 m1.dcos systemd[1]: dcos-metronome.service failed.
Sep 28 08:47:23 m1.dcos systemd[1]: dcos-metronome.service holdoff time over, sc
heduling restart.
Sep 28 08:47:23 m1.dcos systemd[1]: Starting Jobs Service: DC/OS Metronome...
Sep 28 08:47:23 m1.dcos systemd[1]: dcos-metronome.service: control process exit
ed, code=exited status=1
Sep 28 08:47:23 m1.dcos systemd[1]: Failed to start Jobs Service: DC/OS Metronom
e.
Sep 28 08:47:23 m1.dcos systemd[1]: Unit dcos-metronome.service entered failed s
tate.
Sep 28 08:47:23 m1.dcos systemd[1]: dcos-metronome.service failed.
Sep 28 08:47:38 m1.dcos systemd[1]: dcos-metronome.service holdoff time over, sc
heduling restart.
Sep 28 08:47:38 m1.dcos systemd[1]: Starting Jobs Service: DC/OS Metronome...
Sep 28 08:47:38 m1.dcos systemd[1]: dcos-metronome.service: control process exit
ed, code=exited status=1
Sep 28 08:47:38 m1.dcos systemd[1]: Failed to start Jobs Service: DC/OS Metronom
e.
Sep 28 08:47:38 m1.dcos systemd[1]: Unit dcos-metronome.service entered failed s
tate.
Sep 28 08:47:38 m1.dcos systemd[1]: dcos-metronome.service failed.
Sep 28 08:47:53 m1.dcos systemd[1]: dcos-metronome.service holdoff time over, sc
heduling restart.
Sep 28 08:47:53 m1.dcos systemd[1]: Starting Jobs Service: DC/OS Metronome...
Sep 28 08:47:53 m1.dcos systemd[1]: dcos-metronome.service: control process exit
ed, code=exited status=1
Sep 28 08:47:53 m1.dcos systemd[1]: Failed to start Jobs Service: DC/OS Metronom
e.
Sep 28 08:47:53 m1.dcos systemd[1]: Unit dcos-metronome.service entered failed s
tate.
Sep 28 08:47:53 m1.dcos systemd[1]: dcos-metronome.service failed.
Sep 28 08:48:08 m1.dcos systemd[1]: dcos-metronome.service holdoff time over, sc
heduling restart.
Sep 28 08:48:08 m1.dcos systemd[1]: Starting Jobs Service: DC/OS Metronome...
Sep 28 08:48:08 m1.dcos systemd[1]: dcos-metronome.service: control process exit
ed, code=exited status=1
Sep 28 08:48:08 m1.dcos systemd[1]: Failed to start Jobs Service: DC/OS Metronom
e.
Sep 28 08:48:08 m1.dcos systemd[1]: Unit dcos-metronome.service entered failed s
tate.
Sep 28 08:48:08 m1.dcos systemd[1]: dcos-metronome.service failed.
Sep 28 08:48:23 m1.dcos systemd[1]: dcos-metronome.service holdoff time over, sc
heduling restart.
Sep 28 08:48:23 m1.dcos systemd[1]: Starting Jobs Service: DC/OS Metronome...
Sep 28 08:48:23 m1.dcos systemd[1]: dcos-metronome.service: control process exit
ed, code=exited status=1
Sep 28 08:48:23 m1.dcos systemd[1]: Failed to start Jobs Service: DC/OS Metronom
e.
Sep 28 08:48:23 m1.dcos systemd[1]: Unit dcos-metronome.service entered failed s
tate.
Sep 28 08:48:23 m1.dcos systemd[1]: dcos-metronome.service failed.

[dcos-minuteman.service]: DC/OS Layer 4 Load Balancing Service unit dcos-minutem
an.service has never entered `active` state
-- Logs begin at Wed 2016-09-28 08:22:53 PDT, end at Wed 2016-09-28 08:48:37 PDT
. --
Sep 28 08:48:00 m1.dcos systemd[1]: dcos-minuteman.service failed.
Sep 28 08:48:06 m1.dcos systemd[1]: dcos-minuteman.service holdoff time over, sc
heduling restart.
Sep 28 08:48:06 m1.dcos systemd[1]: Starting Layer 4 Load Balancer: DC/OS Layer
4 Load Balancing Service...
Sep 28 08:48:06 m1.dcos systemd[1]: dcos-minuteman.service: control process exit
ed, code=exited status=1
Sep 28 08:48:06 m1.dcos systemd[1]: Failed to start Layer 4 Load Balancer: DC/OS
 Layer 4 Load Balancing Service.
Sep 28 08:48:06 m1.dcos systemd[1]: Unit dcos-minuteman.service entered failed s
tate.
Sep 28 08:48:06 m1.dcos systemd[1]: dcos-minuteman.service failed.
Sep 28 08:48:06 m1.dcos check-time[11810]: Time is marked as bad
Sep 28 08:48:11 m1.dcos systemd[1]: dcos-minuteman.service holdoff time over, sc
heduling restart.
Sep 28 08:48:11 m1.dcos systemd[1]: Starting Layer 4 Load Balancer: DC/OS Layer
4 Load Balancing Service...
Sep 28 08:48:11 m1.dcos check-time[11837]: Time is marked as bad
Sep 28 08:48:11 m1.dcos systemd[1]: dcos-minuteman.service: control process exit
ed, code=exited status=1
Sep 28 08:48:11 m1.dcos systemd[1]: Failed to start Layer 4 Load Balancer: DC/OS
 Layer 4 Load Balancing Service.
Sep 28 08:48:11 m1.dcos systemd[1]: Unit dcos-minuteman.service entered failed s
tate.
Sep 28 08:48:11 m1.dcos systemd[1]: dcos-minuteman.service failed.
Sep 28 08:48:16 m1.dcos systemd[1]: dcos-minuteman.service holdoff time over, sc
heduling restart.
Sep 28 08:48:16 m1.dcos systemd[1]: Starting Layer 4 Load Balancer: DC/OS Layer
4 Load Balancing Service...
Sep 28 08:48:16 m1.dcos check-time[11869]: Time is marked as bad
Sep 28 08:48:16 m1.dcos systemd[1]: dcos-minuteman.service: control process exit
ed, code=exited status=1
Sep 28 08:48:16 m1.dcos systemd[1]: Failed to start Layer 4 Load Balancer: DC/OS
 Layer 4 Load Balancing Service.
Sep 28 08:48:16 m1.dcos systemd[1]: Unit dcos-minuteman.service entered failed s
tate.
Sep 28 08:48:16 m1.dcos systemd[1]: dcos-minuteman.service failed.
Sep 28 08:48:21 m1.dcos systemd[1]: dcos-minuteman.service holdoff time over, sc
heduling restart.
Sep 28 08:48:21 m1.dcos systemd[1]: Starting Layer 4 Load Balancer: DC/OS Layer
4 Load Balancing Service...
Sep 28 08:48:21 m1.dcos systemd[1]: dcos-minuteman.service: control process exit
ed, code=exited status=1
Sep 28 08:48:21 m1.dcos systemd[1]: Failed to start Layer 4 Load Balancer: DC/OS
 Layer 4 Load Balancing Service.
Sep 28 08:48:21 m1.dcos systemd[1]: Unit dcos-minuteman.service entered failed s
tate.
Sep 28 08:48:21 m1.dcos systemd[1]: dcos-minuteman.service failed.
Sep 28 08:48:21 m1.dcos check-time[11898]: Time is marked as bad
Sep 28 08:48:26 m1.dcos systemd[1]: dcos-minuteman.service holdoff time over, sc
heduling restart.
Sep 28 08:48:26 m1.dcos systemd[1]: Starting Layer 4 Load Balancer: DC/OS Layer
4 Load Balancing Service...
Sep 28 08:48:26 m1.dcos check-time[11928]: Time is marked as bad
Sep 28 08:48:26 m1.dcos systemd[1]: dcos-minuteman.service: control process exit
ed, code=exited status=1
Sep 28 08:48:26 m1.dcos systemd[1]: Failed to start Layer 4 Load Balancer: DC/OS
 Layer 4 Load Balancing Service.
Sep 28 08:48:26 m1.dcos systemd[1]: Unit dcos-minuteman.service entered failed s
tate.
Sep 28 08:48:26 m1.dcos systemd[1]: dcos-minuteman.service failed.
Sep 28 08:48:31 m1.dcos systemd[1]: dcos-minuteman.service holdoff time over, sc
heduling restart.
Sep 28 08:48:31 m1.dcos systemd[1]: Starting Layer 4 Load Balancer: DC/OS Layer
4 Load Balancing Service...
Sep 28 08:48:31 m1.dcos systemd[1]: dcos-minuteman.service: control process exit
ed, code=exited status=1
Sep 28 08:48:31 m1.dcos systemd[1]: Failed to start Layer 4 Load Balancer: DC/OS
 Layer 4 Load Balancing Service.
Sep 28 08:48:31 m1.dcos systemd[1]: Unit dcos-minuteman.service entered failed s
tate.
Sep 28 08:48:31 m1.dcos systemd[1]: dcos-minuteman.service failed.
Sep 28 08:48:31 m1.dcos check-time[11962]: Time is marked as bad
Sep 28 08:48:37 m1.dcos systemd[1]: dcos-minuteman.service holdoff time over, sc
heduling restart.
Sep 28 08:48:37 m1.dcos systemd[1]: Starting Layer 4 Load Balancer: DC/OS Layer
4 Load Balancing Service...
Sep 28 08:48:37 m1.dcos check-time[12002]: Time is marked as bad
Sep 28 08:48:37 m1.dcos systemd[1]: dcos-minuteman.service: control process exit
ed, code=exited status=1
Sep 28 08:48:37 m1.dcos systemd[1]: Failed to start Layer 4 Load Balancer: DC/OS
 Layer 4 Load Balancing Service.
Sep 28 08:48:37 m1.dcos systemd[1]: Unit dcos-minuteman.service entered failed s
tate.
Sep 28 08:48:37 m1.dcos systemd[1]: dcos-minuteman.service failed.

[dcos-navstar.service]: A distributed systems & network overlay orchestration en
gine unit dcos-navstar.service has never entered `active` state
-- Logs begin at Wed 2016-09-28 08:22:53 PDT, end at Wed 2016-09-28 08:48:37 PDT
. --
Sep 28 08:48:00 m1.dcos systemd[1]: dcos-navstar.service failed.
Sep 28 08:48:06 m1.dcos systemd[1]: dcos-navstar.service holdoff time over, sche
duling restart.
Sep 28 08:48:06 m1.dcos systemd[1]: Starting Navstar: A distributed systems & ne
twork overlay orchestration engine...
Sep 28 08:48:06 m1.dcos systemd[1]: dcos-navstar.service: control process exited
, code=exited status=1
Sep 28 08:48:06 m1.dcos systemd[1]: Failed to start Navstar: A distributed syste
ms & network overlay orchestration engine.
Sep 28 08:48:06 m1.dcos systemd[1]: Unit dcos-navstar.service entered failed sta
te.
Sep 28 08:48:06 m1.dcos systemd[1]: dcos-navstar.service failed.
Sep 28 08:48:06 m1.dcos check-time[11809]: Time is marked as bad
Sep 28 08:48:11 m1.dcos systemd[1]: dcos-navstar.service holdoff time over, sche
duling restart.
Sep 28 08:48:11 m1.dcos systemd[1]: Starting Navstar: A distributed systems & ne
twork overlay orchestration engine...
Sep 28 08:48:11 m1.dcos check-time[11839]: Time is marked as bad
Sep 28 08:48:11 m1.dcos systemd[1]: dcos-navstar.service: control process exited
, code=exited status=1
Sep 28 08:48:11 m1.dcos systemd[1]: Failed to start Navstar: A distributed syste
ms & network overlay orchestration engine.
Sep 28 08:48:11 m1.dcos systemd[1]: Unit dcos-navstar.service entered failed sta
te.
Sep 28 08:48:11 m1.dcos systemd[1]: dcos-navstar.service failed.
Sep 28 08:48:16 m1.dcos systemd[1]: dcos-navstar.service holdoff time over, sche
duling restart.
Sep 28 08:48:16 m1.dcos systemd[1]: Starting Navstar: A distributed systems & ne
twork overlay orchestration engine...
Sep 28 08:48:16 m1.dcos check-time[11867]: Time is marked as bad
Sep 28 08:48:16 m1.dcos systemd[1]: dcos-navstar.service: control process exited
, code=exited status=1
Sep 28 08:48:16 m1.dcos systemd[1]: Failed to start Navstar: A distributed syste
ms & network overlay orchestration engine.
Sep 28 08:48:16 m1.dcos systemd[1]: Unit dcos-navstar.service entered failed sta
te.
Sep 28 08:48:16 m1.dcos systemd[1]: dcos-navstar.service failed.
Sep 28 08:48:21 m1.dcos systemd[1]: dcos-navstar.service holdoff time over, sche
duling restart.
Sep 28 08:48:21 m1.dcos systemd[1]: Starting Navstar: A distributed systems & ne
twork overlay orchestration engine...
Sep 28 08:48:21 m1.dcos systemd[1]: dcos-navstar.service: control process exited
, code=exited status=1
Sep 28 08:48:21 m1.dcos check-time[11899]: Time is marked as bad
Sep 28 08:48:21 m1.dcos systemd[1]: Failed to start Navstar: A distributed syste
ms & network overlay orchestration engine.
Sep 28 08:48:21 m1.dcos systemd[1]: Unit dcos-navstar.service entered failed sta
te.
Sep 28 08:48:21 m1.dcos systemd[1]: dcos-navstar.service failed.
Sep 28 08:48:26 m1.dcos systemd[1]: dcos-navstar.service holdoff time over, sche
duling restart.
Sep 28 08:48:26 m1.dcos systemd[1]: Starting Navstar: A distributed systems & ne
twork overlay orchestration engine...
Sep 28 08:48:26 m1.dcos check-time[11929]: Time is marked as bad
Sep 28 08:48:26 m1.dcos systemd[1]: dcos-navstar.service: control process exited
, code=exited status=1
Sep 28 08:48:26 m1.dcos systemd[1]: Failed to start Navstar: A distributed syste
ms & network overlay orchestration engine.
Sep 28 08:48:26 m1.dcos systemd[1]: Unit dcos-navstar.service entered failed sta
te.
Sep 28 08:48:26 m1.dcos systemd[1]: dcos-navstar.service failed.
Sep 28 08:48:31 m1.dcos systemd[1]: dcos-navstar.service holdoff time over, sche
duling restart.
Sep 28 08:48:31 m1.dcos systemd[1]: Starting Navstar: A distributed systems & ne
twork overlay orchestration engine...
Sep 28 08:48:31 m1.dcos systemd[1]: dcos-navstar.service: control process exited
, code=exited status=1
Sep 28 08:48:31 m1.dcos systemd[1]: Failed to start Navstar: A distributed syste
ms & network overlay orchestration engine.
Sep 28 08:48:31 m1.dcos systemd[1]: Unit dcos-navstar.service entered failed sta
te.
Sep 28 08:48:31 m1.dcos systemd[1]: dcos-navstar.service failed.
Sep 28 08:48:31 m1.dcos check-time[11963]: Time is marked as bad
Sep 28 08:48:37 m1.dcos systemd[1]: dcos-navstar.service holdoff time over, sche
duling restart.
Sep 28 08:48:37 m1.dcos systemd[1]: Starting Navstar: A distributed systems & ne
twork overlay orchestration engine...
Sep 28 08:48:37 m1.dcos check-time[12001]: Time is marked as bad
Sep 28 08:48:37 m1.dcos systemd[1]: dcos-navstar.service: control process exited
, code=exited status=1
Sep 28 08:48:37 m1.dcos systemd[1]: Failed to start Navstar: A distributed syste
ms & network overlay orchestration engine.
Sep 28 08:48:37 m1.dcos systemd[1]: Unit dcos-navstar.service entered failed sta
te.
Sep 28 08:48:37 m1.dcos systemd[1]: dcos-navstar.service failed.

time="2016-09-28T08:48:37-07:00" level=fatal msg="Found unhealthy systemd units"

I don't know what I am missing. I searched for this error but still couldn't find what exactly is wrong. Any help with this would be appreciated.
Thank you.


Solution

  • "Time is marked as bad" comes from a new time-check process that was added in DC/OS 1.8.4 to validate that time is synchronized between nodes.

    Usually, a production cluster would use NTP to synchronize node clocks, but this hasn't been baked into the dcos-vagrant base image yet (I just filed a ticket for it, actually). VirtualBox does some time synchronization natively, which usually works but is occasionally not quite good enough.

    So, to work around this temporarily, the example config.yaml was updated to use check_time: false, which disables the time check. You'll need to update your local copy.
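
    As a rough sketch of that change, the relevant line in a local config.yaml might look like the fragment below. Only the check_time key comes from the answer above; the exact location of your config.yaml and the other keys it already contains depend on your dcos-vagrant checkout and are not shown here.

    ```yaml
    # Fragment of the local DC/OS installer config used by dcos-vagrant.
    # Keep every other existing setting in this file as it is.
    # Disables the time-synchronization check that logs "Time is marked as bad".
    check_time: false
    ```

    This only turns the check off; it does not synchronize the clocks, so it is best treated as a temporary measure until the nodes run NTP.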