chef-infra, aws-opsworks, knife

Knife view of nodes not in sync with Automate server


OK, so in our AWS environment we are running Chef Automate via AWS OpsWorks. Nodes automatically register and download the Chef client via code in the AWS user data. We have another EC2 instance acting as a Chef workstation. All this is working well. However, in this particular non-production environment, EC2 nodes come and go quite often. To keep things cleaned up, we run the following cron job on the Automate server:

automate-ctl node-summary | grep missing | awk '{print $2}' | while read -r var; do automate-ctl delete-node --force -d -u "$var"; done

This deletes any node once it shows up as "missing". This works fine as well. However, when we run knife node list or knife status on the workstation, we get hundreds of dead nodes, some of which have been gone for thousands of hours.

Clearly, knife is not getting node data from the same database as automate-ctl. Ideally, I would like some command I can run via cron on the Automate server to keep these in sync, but I don't see an obvious solution in the docs. I assume knife is connecting to the Automate server to get its list, so I would much prefer a solution that runs there, rather than on the workstation.

Any ideas?


Solution

  • You can run something similar against the Chef Server using knife node bulk delete and knife client bulk delete, both of which take a regular expression matching the names to remove. There are also Lambda tasks that monitor for instance shutdown events and clean up the Chef Server.

    Chef Server and Automate communicate, but each has its own database. Automate often keeps historical records for auditing or compliance trails even after a node is removed from the Chef Server, so deleting a node in one does not delete it in the other.
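As a rough illustration of the knife-side cleanup, here is a minimal sketch that could run from cron on any host where knife is already configured to talk to the Chef Server. The function names, the DRY_RUN variable, and the 24-hour default are made up for this example. It also assumes that the ohai_time attribute is searchable as an epoch timestamp on your server and that each node's client has the same name as the node, so verify both before turning off the dry run.

```shell
# Sketch only: cutoff_epoch, stale_nodes, purge_stale_nodes and DRY_RUN
# are hypothetical names, not part of Chef or knife.

# Print the epoch timestamp that is $1 hours before now.
cutoff_epoch() {
  echo $(( $(date +%s) - $1 * 3600 ))
}

# List the names of nodes whose last check-in (ohai_time) is older
# than $1 hours. Blank lines in the knife output are dropped.
stale_nodes() {
  knife search node "ohai_time:[* TO $(cutoff_epoch "$1")]" -i 2>/dev/null |
    sed '/^[[:space:]]*$/d'
}

# Delete the node and client objects for every stale node.
# By default this only prints what it would do; set DRY_RUN=0 to delete.
purge_stale_nodes() {
  stale_nodes "${1:-24}" | while read -r stale_name; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "would delete: $stale_name"
    else
      # </dev/null keeps knife from consuming the list being read by the loop.
      knife node delete "$stale_name" -y </dev/null
      knife client delete "$stale_name" -y </dev/null
    fi
  done
}
```

Calling purge_stale_nodes 48 prints the nodes that have not checked in for two days; once the list looks right, run it with DRY_RUN=0. Note that nodes which have never completed a Chef run may have no ohai_time and would not be matched by this search.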