I have two AWS instances:
production-01
docker-machine-master
I SSH into docker-machine-master and run docker stack deploy -c deploy/docker-compose.yml --with-registry-auth production, and I get this error:
this node is not a swarm manager. Use "docker swarm init" or "docker swarm join" to connect this node to swarm and try again
My guess is that the swarm manager went down at some point and this new instance spun up somehow, keeping the same information/configuration minus the swarm manager info. Maybe the internal IP changed or something. I'm making that guess because the launch times differ by months; the production-01 instance was launched six months earlier. I wouldn't know for sure, because I'm new to AWS, Docker, and this project.
I want to deploy code changes to the production-01 instance, but I don't have SSH keys to do so. Also, my hunch is that production-01 is one of the replicas noted in the docker-compose.yml file.
I'm the only dev on this project, so any help would be much appreciated.
Here's a copy of my docker-compose.yml file with names changed.
version: '3'
services:
  database:
    image: postgres:10
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=pass
    deploy:
      replicas: 1
    volumes:
      - db:/var/lib/postgresql/data
  aservicename:
    image: 123.456.abc.amazonaws.com/reponame
    ports:
      - 80:80
    depends_on:
      - database
    environment:
      DB_HOST: database
      DATA_IMPORT_BUCKET: some_sql_bucket
      FQDN: somedomain.com
      DJANGO_SETTINGS_MODULE: name.settings.production
      DEBUG: "true"
    deploy:
      mode: global
    logging:
      driver: awslogs
      options:
        awslogs-group: aservicename
  cron:
    image: 123.456.abc.amazonaws.com/reponame
    depends_on:
      - database
    environment:
      DB_HOST: database
      DATA_IMPORT_BUCKET: some_sql_bucket
      FQDN: somedomain.com
      DOCKER_SETTINGS_MODULE: name.settings.production
    deploy:
      replicas: 1
    command: /name/deploy/someshellfile.sh
    logging:
      driver: awslogs
      options:
        awslogs-group: cron
networks:
  default:
    driver: overlay
    ipam:
      driver: default
      config:
        - subnet: 192.168.100.0/24
volumes:
  db:
    driver: rexray/ebs
I'll assume you only have the one manager, and that production-01 is a worker.
If docker info shows Swarm: inactive and you don't have backups of the Swarm raft log, then you'll need to create a new swarm with docker swarm init.
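A minimal sketch of that check and re-init, assuming the node's private VPC IP is 10.0.0.10 (a placeholder; substitute your own):

    # Check the current swarm state on docker-machine-master
    docker info --format '{{.Swarm.LocalNodeState}}'

    # If it prints "inactive", start a fresh single-manager swarm.
    # --advertise-addr should be this node's private IP on the VPC (placeholder below).
    docker swarm init --advertise-addr 10.0.0.10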
Be sure it has the rexray/ebs driver by checking docker plugin ls. All nodes will need that plugin to use the db volume.
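For example (the install step is only a sketch; the rexray/ebs plugin needs AWS credentials or an instance IAM role set up the same way the original nodes were configured, which I can't know from here):

    # Confirm the plugin is installed and enabled on each node
    docker plugin ls

    # If it's missing, install it; any credential/region settings must match
    # your existing setup (omitted here because they're environment-specific)
    docker plugin install --grant-all-permissions rexray/ebs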
If you can't SSH to production-01, there will be no way to have it leave and join the new swarm. You'd need to deploy a new worker node and shut down that existing server.
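Roughly, the join looks like this (the token and IP below are placeholders; use whatever your own manager prints):

    # On the manager (docker-machine-master): print the worker join command
    docker swarm join-token worker

    # On the new worker instance: run the command the manager printed, e.g.
    docker swarm join --token SWMTKN-1-xxxx 10.0.0.10:2377

    # Back on the manager: confirm both nodes show up and are Ready
    docker node ls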
Then you can docker stack deploy that app again and it should reconnect the db volume.
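Something like this, run from the manager with the repo checked out as before:

    # Redeploy the stack
    docker stack deploy -c deploy/docker-compose.yml --with-registry-auth production

    # Watch the services come up and see where tasks got scheduled
    docker stack services production
    docker service ps production_aservicename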
Note 1: Don't redeploy the stack on the new servers if it's still running on the production-01 worker; it would fail because the EBS volume for db will still be attached to production-01.
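If you have AWS CLI access, you can verify that before redeploying; the instance ID below is a placeholder for production-01's ID:

    # List EBS volumes still attached to the old worker
    aws ec2 describe-volumes \
      --filters Name=attachment.instance-id,Values=i-0123456789abcdef0 \
      --query 'Volumes[].{ID:VolumeId,State:Attachments[0].State}'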
Note 2: For anything beyond learning, it's best to run three managers (managers are also workers by default). That way, if one node gets killed, you still have a working swarm.
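Once two more nodes have joined, promoting them is a one-liner (the node names are placeholders; use the names from docker node ls):

    # Promote two workers so the swarm has three managers
    docker node promote worker-node-1 worker-node-2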