We have a 3-node Service Fabric cluster. Node 0, which was the one we used to set up the cluster, is working but is not listed under the system ClusterManagerService or the other system services; it only appears under the FailoverManagerService.
How can I add it back in? I'm stumped at the moment: I've spent most of the day on this and am none the wiser.
With no answers I am thinking I will just need to remove the cluster and then recreate it.
I was unable to recover it, so I removed the cluster and recreated it with the following commands, run in PowerShell on one of the nodes.
Connect-ServiceFabricCluster
Get the current configuration for the cluster
Get-ServiceFabricClusterConfiguration > C:\temp\train_cluster_config_old.json
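Before removing anything, it may be worth confirming that the exported file parses and lists every node you expect. The following is only a minimal sketch: it assumes the export has a top-level Nodes array whose entries carry NodeName and IPAddress, and Get-ExportedNodeList is a hypothetical helper name, not a Service Fabric cmdlet.

```powershell
# Hypothetical helper: list the nodes recorded in an exported cluster config.
# Assumes a top-level "Nodes" array whose entries carry NodeName and IPAddress.
function Get-ExportedNodeList {
    param(
        [Parameter(Mandatory)]
        [string]$Path
    )

    # -Raw reads the whole file as one string so ConvertFrom-Json sees a single document.
    $config = Get-Content -Path $Path -Raw | ConvertFrom-Json

    # Property lookup is case-insensitive, so "nodes" and "Nodes" both resolve.
    foreach ($node in $config.Nodes) {
        [pscustomobject]@{
            NodeName  = $node.NodeName
            IPAddress = $node.IPAddress
        }
    }
}
```

For example, Get-ExportedNodeList -Path C:\temp\train_cluster_config_old.json | Format-Table should give one row per node. If a node is missing from the list, correct the file before using it to recreate the cluster.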
With the config we can now remove the cluster
Remove-ServiceFabricCluster -ClusterConfigFilePath C:\temp\train_cluster_config_old.json
You may have to remove any leftover node folders under C:\ProgramData\SF; otherwise the next steps will tell you that you need to remove them.
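A rough sketch of that cleanup, assuming the data root is the C:\ProgramData\SF path mentioned above. Get-LeftoverNodeFolder is a hypothetical helper name, and the removal line keeps -WhatIf so it only previews what would be deleted.

```powershell
# Hypothetical helper: report directories left behind under the Service Fabric data root.
# The default root is an assumption taken from the path above; adjust it if yours differs.
function Get-LeftoverNodeFolder {
    param(
        [string]$DataRoot = 'C:\ProgramData\SF'
    )

    # Nothing to clean up if the data root itself is gone.
    if (-not (Test-Path -Path $DataRoot -PathType Container)) {
        return
    }

    # Return only directories; loose files are left for manual inspection.
    Get-ChildItem -Path $DataRoot -Directory
}

# Preview the deletion first; drop -WhatIf once the list looks right.
Get-LeftoverNodeFolder | Remove-Item -Recurse -Force -WhatIf
```

Run this in an elevated PowerShell session on each node that still has leftovers, and only after the cluster has been removed.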
Make sure you are happy with the config, then test it with the tools from the Service Fabric for Windows Server package, which you will need on one node: https://go.microsoft.com/fwlink/?LinkId=730690
.\TestConfiguration.ps1 -ClusterConfigFilePath C:\temp\train_cluster_config_old.json -FabricRuntimePackagePath C:\temp\Microsoft.Azure.ServiceFabric.WindowsServer.8.1.321.9590\DeploymentRuntimePackages\MicrosoftAzureServiceFabric.8.1.321.9590.cab
If that all succeeds, run the command that will create the cluster
.\CreateServiceFabricCluster.ps1 -ClusterConfigFilePath C:\temp\train_cluster_config_old.json -FabricRuntimePackagePath C:\temp\Microsoft.Azure.ServiceFabric.WindowsServer.8.1.321.9590\DeploymentRuntimePackages\MicrosoftAzureServiceFabric.8.1.321.9590.cab
Give it a few minutes to run and start up, then navigate to https://localhost:19080/Explorer/index.html on the nodes to make sure it's running.
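The same check can be done from PowerShell instead of the browser. This is a sketch, assuming the objects returned by Get-ServiceFabricNode expose NodeName, NodeStatus and HealthState. Select-ProblemNode is a hypothetical filter, not part of the Service Fabric module.

```powershell
# Hypothetical filter: pass through only nodes that are not Up or not healthy.
function Select-ProblemNode {
    param(
        [Parameter(ValueFromPipeline)]
        $Node
    )
    process {
        # Convert to string so the check works for both enum values and plain text.
        if ("$($Node.NodeStatus)" -ne 'Up' -or "$($Node.HealthState)" -ne 'Ok') {
            $Node
        }
    }
}

# Usage on a cluster node (requires the Service Fabric PowerShell module):
# Connect-ServiceFabricCluster
# Get-ServiceFabricNode | Select-ProblemNode | Format-Table NodeName, NodeStatus, HealthState
#
# To see which nodes host the ClusterManagerService replicas:
# Get-ServiceFabricPartition -ServiceName fabric:/System/ClusterManagerService | Get-ServiceFabricReplica
```

No output from the filter means every node reported as Up and Ok. Listing the replicas is a quick way to confirm that all three nodes now appear under the system services.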
You will now need to deploy all your applications again as the cluster will be empty.