My goal is to upgrade the OS of a Service Fabric cluster's VM scale set (VMSS) from Windows Server 2016 to Windows Server 2019.
I followed the Microsoft document "Scale up a Service Fabric cluster primary node type".
I am facing the following issue:
Once the new VMSS is part of the Service Fabric cluster, I will disable the old Windows Server 2016 node scale set.
Any ideas, or is there an alternative way to upgrade the VMSS OS from Windows Server 2016 to Windows Server 2019?
Microsoft reference link on Scale up a Service Fabric cluster primary node type
My findings on the above query: I have successfully upgraded the Service Fabric cluster VMSS OS from 2016 to 2019.
-With the ARM template as it stood, the newly created VMSS was not part of the Service Fabric cluster. I made the following changes under nodeTypes:
"managementEndpoint": "[concat('https://',reference(concat(parameters('lbIPName'),'-','0')).dnsSettings.fqdn,':',parameters('nt0fabricHttpGatewayPort'))]",
"nodeTypes": [
{
"name": "[parameters('vmNodeType2Name')]",
"applicationPorts": {
*
*
},
When you deploy the ARM template with the above changes, the newly created VMSS will be part of the existing service fabric cluster.
-Connect to the Service Fabric cluster using the following command
$clusterName = "Cluster-URL:19000"
$thumb = "xxxxxxxxxxx"
Connect-ServiceFabricCluster `
-ConnectionEndpoint $clusterName `
-KeepAliveIntervalInSec 10 `
-X509Credential `
-ServerCertThumbprint $thumb `
-FindType FindByThumbprint `
-FindValue $thumb `
-StoreLocation CurrentUser `
-StoreName My
-Disable the Service Fabric cluster nodes that are going to be deleted (i.e. the 2016 VMSS)
$nodeNames = @("_NTvm1_0","_NTvm1_1","_NTvm1_2","_NTvm1_3","_NTvm1_4")
Write-Host "Disabling nodes..."
foreach($name in $nodeNames){
Disable-ServiceFabricNode -NodeName $name -Intent RemoveNode -Force
}
After the above command runs successfully, the nodes first show a status of Disabling and, after some time, Disabled. This can be monitored in Service Fabric Explorer.
-The next step is to remove the VMSS whose nodes were disabled in the previous step
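Rather than watching Service Fabric Explorer, the wait can be scripted. This is a minimal sketch, assuming the session is still connected to the cluster and reusing $nodeNames from above. Wait-NodesDisabled, its -GetStatus hook, the timeout, and the poll interval are illustrative names and values, not part of the original steps.

```powershell
# Hypothetical helper: polls until every listed node reports NodeStatus 'Disabled'.
# -GetStatus is an illustrative hook so the loop can be tried without a cluster;
# by default it queries the connected cluster via Get-ServiceFabricNode.
function Wait-NodesDisabled {
    param(
        [Parameter(Mandatory)][string[]]$NodeNames,
        [int]$TimeoutSec = 3600,   # assumed upper bound; adjust to your cluster
        [int]$PollSec = 30,        # assumed poll interval
        [scriptblock]$GetStatus = { param($n) (Get-ServiceFabricNode -NodeName $n).NodeStatus }
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSec)
    while ($true) {
        # Collect the nodes that have not reached 'Disabled' yet
        $pending = @($NodeNames | Where-Object { "$(& $GetStatus $_)" -ne 'Disabled' })
        if ($pending.Count -eq 0) { return $true }
        if ((Get-Date) -ge $deadline) {
            Write-Warning "Still not disabled: $($pending -join ', ')"
            return $false
        }
        Write-Host "Waiting for: $($pending -join ', ')"
        Start-Sleep -Seconds $PollSec
    }
}
```

Calling Wait-NodesDisabled -NodeNames $nodeNames returns $true once all nodes are Disabled, or $false if the timeout is reached, so the scale set removal can be gated on the result.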
$scaleSetName = "NTvm1"
$resourceGroupName = "RG-NAME"
Remove-AzVmss `
-ResourceGroupName $resourceGroupName `
-VMScaleSetName $scaleSetName `
-Force
Write-Host "Removed scale set $scaleSetName"
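-At this point Service Fabric Explorer returns a "page not found" error. Don't panic: the cluster's DNS name still belongs to the old public IP, so the load balancer settings need to be moved over to the newly created VMSS
$groupname = "RG-NAME"   # same resource group as above; this variable was not defined in the earlier steps
# The load balancer removed below should be the one still attached to the old public IP
$lbname="Old LB Name"
$oldPublicIpName="Old LB PublicIP"
$newPublicIpName="New LB PublicIP"
# Save the DNS settings of the old public IP before deleting it
$oldprimaryPublicIP = Get-AzPublicIpAddress -Name $oldPublicIpName -ResourceGroupName $groupname
$primaryDNSName = $oldprimaryPublicIP.DnsSettings.DomainNameLabel
$primaryDNSFqdn = $oldprimaryPublicIP.DnsSettings.Fqdn
Remove-AzLoadBalancer -Name $lbname -ResourceGroupName $groupname -Force
Remove-AzPublicIpAddress -Name $oldPublicIpName -ResourceGroupName $groupname -Force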
-Update the DNS settings of the new public IP address so that it takes over the DNS name of the old primary node type's public IP address
$PublicIP = Get-AzPublicIpAddress -Name $newPublicIpName -ResourceGroupName $groupname
$PublicIP.DnsSettings.DomainNameLabel = $primaryDNSName
$PublicIP.DnsSettings.Fqdn = $primaryDNSFqdn
Set-AzPublicIpAddress -PublicIpAddress $PublicIP
Once this is done, we are good to go.
-Check the Service Fabric cluster health status using the Get-ServiceFabricClusterHealth command
NOTE: Make sure your cluster reliability level is set to "Silver". Microsoft recommends this for production environments.
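The health check can also be reduced to a list of nodes that need attention. This is a rough sketch, assuming a connected session; Get-UnhealthyNodeNames is a hypothetical helper that only reads the NodeHealthStates collection from the object returned by Get-ServiceFabricClusterHealth.

```powershell
# Hypothetical helper: returns the names of nodes whose aggregated health is not 'Ok'.
# Expects the object returned by Get-ServiceFabricClusterHealth.
function Get-UnhealthyNodeNames {
    param([Parameter(Mandatory)]$ClusterHealth)
    @($ClusterHealth.NodeHealthStates |
        Where-Object { "$($_.AggregatedHealthState)" -ne 'Ok' } |
        ForEach-Object { $_.NodeName })
}
```

For example, Get-UnhealthyNodeNames -ClusterHealth (Get-ServiceFabricClusterHealth) should return nothing once the new node type is healthy and the old nodes are gone.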