azure docker marathon dcos

SSH into a DC/OS created public agent node to deploy a Docker container?


Q: I wish to SSH into a DC/OS public agent to mount my file share with Docker credentials, so I can deploy Docker via Marathon. How can I SSH directly into this agent node, without going through the master?


Backstory: I did a vanilla DC/OS installation on Azure. I got two nodes provisioned (a master and an agent). I installed Marathon on the master.

First, I tried to deploy a container through Marathon from an image/repo I created on Azure Container Registry. It failed because the CPU resource requirement could not be satisfied; that's partly understandable, because Marathon seems to use up the entire CPU of the master node. But I couldn't figure out how to make Marathon notice that there was another node around: the public agent node, which is running nothing.

Second, I figured I could just use the "Service" interface provided on the DC/OS layer itself (which I believe is just a UI layer for Marathon or similar).

This time, it accurately recognizes the agent node and that there is compute available on it. But to make it pull from my private registry, I need to put my Docker credentials on this node. Here's where I get stuck: I can't SSH to the agent node to mount the shared storage (which is already mounted on the master). Since this node is provisioned through the virtual machine scale set, I really can't figure out the right inbound NAT rules and network security configuration to map to this node and get me a reliable FQDN and port that will allow me to SSH in and run cifs. Honestly, DC/OS should have taken care of this for me, since I am doing the most standard thing.

I tried this, but it isn't sufficient/correct (even though it creates the rule):

az network lb inbound-nat-rule create --resource-group production --lb-name <lb-name> --name NATRule --protocol TCP --frontend-port 2200 --backend-port 22

(All the elaborate VMSS videos from Microsoft cover the old interface and rely on port range mapping, which I can't seem to figure out from the CLI. Plus, the portal is still in progress when it comes to inbound NAT rules.)

I am new to the Azure and DC/OS world (moving resources from AWS), so I'd appreciate the help.


UPDATE: Fwiw, it turns out I tried the in-preview DC/OS on Azure service, as opposed to DC/OS on Azure Container Service, which is still slightly unstable. Launch containers through the "Services" interface of the main DC/OS UI instead of through Marathon.


Solution

  • I really can't figure out the right inbound NAT rules and network security configuration to map to this node and get me a reliable FQDN and port that will allow me to SSH in and run cifs.

    For now, we can add an inbound rule to the VMSS load balancer via CLI 2.0, but CLI 2.0 can't specify the target NIC, so we can't use a NAT rule to SSH into VMSS instances.

    If you have only one instance in this VMSS, you can add a load balancer rule to SSH into it: add a probe for port 22 and a load balancer rule for port 22. After that, you can SSH to the VMSS public IP address on port 22.
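
    As a rough sketch of the probe-plus-rule approach described above, the commands could look like the following. The resource group `production` and the `<lb-name>` placeholder come from the question; the probe and rule names (`sshProbe`, `sshRule`) and the `run` dry-run helper are made up for illustration. By default the script only prints the `az` commands; set `DRY_RUN=0` to actually run them.

    ```shell
    # Dry-run sketch: prints the az commands by default; set DRY_RUN=0 to execute them.
    RG="production"     # resource group name taken from the question
    LB="<lb-name>"      # placeholder: the load balancer in front of the VMSS
    PROBE="sshProbe"    # hypothetical probe name
    RULE="sshRule"      # hypothetical rule name

    # Print the command in dry-run mode, otherwise execute it.
    run() {
      if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
      else
        "$@"
      fi
    }

    # Health probe on TCP port 22 so the load balancer can check the instance.
    create_ssh_probe() {
      run az network lb probe create --resource-group "$RG" --lb-name "$LB" \
        --name "$PROBE" --protocol Tcp --port 22
    }

    # Load balancing rule that forwards frontend port 22 to backend port 22.
    create_ssh_rule() {
      run az network lb rule create --resource-group "$RG" --lb-name "$LB" \
        --name "$RULE" --protocol Tcp --frontend-port 22 --backend-port 22 \
        --probe-name "$PROBE"
    }

    create_ssh_probe
    create_ssh_rule
    ```

    If the load balancer has more than one frontend IP configuration or backend pool, you would probably also need to pass `--frontend-ip-name` and `--backend-pool-name` to the rule command.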

    Another way to log in to a DC/OS node is to go through the master: SSH to the master first, then SSH from there to the public agent. There is an existing case that discusses how to log in to a DC/OS agent via the master; please refer to it.
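
    A minimal sketch of the hop through the master, assuming your local OpenSSH is 7.3 or newer so that `-J` (ProxyJump) is available. The user name `azureuser`, the `<master-fqdn>` and `<agent-private-ip>` placeholders, and port 22 on the master are all assumptions; substitute the values for your own cluster. By default it only prints the command.

    ```shell
    # Placeholders: substitute the real values for your cluster.
    MASTER="azureuser@<master-fqdn>"        # admin user and public FQDN of the master (assumed)
    MASTER_PORT="22"                        # SSH port exposed on the master (assumed)
    AGENT="azureuser@<agent-private-ip>"    # admin user and private IP of the public agent (assumed)

    # Compose an ssh invocation that jumps through the master (-J needs OpenSSH 7.3+).
    jump_cmd() {
      printf 'ssh -J %s:%s %s\n' "$MASTER" "$MASTER_PORT" "$AGENT"
    }

    if [ "${DRY_RUN:-1}" = "1" ]; then
      jump_cmd                                  # only print the command
    else
      ssh -J "$MASTER:$MASTER_PORT" "$AGENT"    # actually connect via the master
    fi
    ```

    With `-J`, authentication to both hops happens from your workstation, so the private key does not have to be copied onto the master.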

    After that, you can follow this article to mount the Azure file share on your cluster nodes.
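
    For orientation, mounting an Azure file share over CIFS on the agent typically looks something like the sketch below. The storage account, share name, key, and the `/mnt/share` mount point are placeholders, and the mount options shown (`vers=3.0`, `dir_mode`, `file_mode`) are one common choice, not necessarily what the referenced article uses. It assumes the CIFS client tools (for example `cifs-utils`) are installed on the node. By default it only prints the command.

    ```shell
    # Placeholders: substitute your own storage account details.
    ACCOUNT="<storage-account>"      # Azure storage account name
    SHARE="<share-name>"             # file share name
    KEY="<storage-account-key>"      # storage account access key
    MOUNT_POINT="/mnt/share"         # hypothetical local mount point

    OPTS="vers=3.0,username=$ACCOUNT,password=$KEY,dir_mode=0777,file_mode=0777"

    # Compose the mount command for the share.
    mount_cmd() {
      printf 'sudo mount -t cifs //%s.file.core.windows.net/%s %s -o %s\n' \
        "$ACCOUNT" "$SHARE" "$MOUNT_POINT" "$OPTS"
    }

    if [ "${DRY_RUN:-1}" = "1" ]; then
      mount_cmd                      # only print the command
    else
      sudo mkdir -p "$MOUNT_POINT"   # make sure the mount point exists
      sudo mount -t cifs "//$ACCOUNT.file.core.windows.net/$SHARE" "$MOUNT_POINT" -o "$OPTS"
    fi
    ```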

    By the way, you can also create the container via the DC/OS UI; please refer to it.