How do I stop an Azure Service Fabric compose application upgrade that is failing and never times out?
The upgrade details are below; no timeout is set. I know what the issue with the application is (no registry username/password specified), but I cannot cancel the current upgrade.
UPGRADE DETAILS
Name fabric:/planet
Type Name Compose_5
Target Application Type Version v7
Upgrade Domains
Name State
UD0 InProgress
UD1 Pending
UD2 Pending
Upgrade State RollingForwardInProgress
Next Upgrade Domain UD1
Rolling Upgrade Mode UnmonitoredAuto
Upgrade Description
Name fabric:/planet
Target Application Type Version v7
Upgrade Kind Rolling
Rolling Upgrade Mode UnmonitoredAuto
Upgrade Replica Set Check Timeout In Seconds 4294967295
Force Restart false
Monitoring Policy
Failure Action Manual
Health Check Wait Duration 0.00:00:00.0
Health Check Stable Duration 0.00:02:00.0
Health Check Retry Timeout 0.00:10:00.0
Upgrade Timeout Infinity
Upgrade Domain Timeout Infinity
Upgrade Duration 0.00:21:01.241.0700000000652
Upgrade Domain Duration 0.00:21:01.241.0700000000652
Current Upgrade Domain Progress
Domain Name UD0
Node Upgrade Progress List
Node Name Upgrade Phase Pending Safety Checks
CONTAINERHOST1 Upgrading (empty)
Start Timestamp Utc Fri, 03 Aug 2018 02:20:34 GMT
Failure Timestamp Utc N/A
Failure Reason None
Because you've set the failure action to Manual, the upgrade will stay stuck until you take action.
You could try Start-ServiceFabricApplicationRollback to roll back, or Resume-ServiceFabricApplicationUpgrade to continue.
The recommended approach for upgrading a compose deployment is to use the parameters -Monitored -FailureAction Rollback:
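As a minimal sketch of the rollback path, assuming a session already connected to the cluster (for example via Connect-ServiceFabricCluster) and using the application name fabric:/planet from the upgrade details in the question:

```powershell
# Assumption: the session is already connected to the cluster.
$appName = 'fabric:/planet'   # application name taken from the upgrade details

# Inspect the current upgrade state (expected: RollingForwardInProgress).
Get-ServiceFabricApplicationUpgrade -ApplicationName $appName

# Ask Service Fabric to roll the application back to the previous version.
Start-ServiceFabricApplicationRollback -ApplicationName $appName

# Check the state again to confirm that the rollback has started.
Get-ServiceFabricApplicationUpgrade -ApplicationName $appName
```

Whether the application-level cmdlets are accepted for a compose deployment may depend on the runtime version, so treat this as a starting point rather than a guaranteed fix.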
Start-ServiceFabricComposeDeploymentUpgrade -DeploymentName mydeployment -Compose docker-compose.yml -Monitored -FailureAction Rollback
Unless this manual intervention was intended, Service Fabric should handle the failure by itself when the upgrade parameters are configured correctly.
Fixing these settings might solve your problem:
Rolling Upgrade Mode is set to UnmonitoredAuto, which proceeds through the upgrade domains automatically but does not perform health checks. Consider using Monitored.
Upgrade Domain Timeout and Upgrade Timeout are set to Infinity. They should have a timeout set; otherwise the upgrade will wait forever.
Failure Action is set to Manual, so the upgrade is suspended to let you investigate the deployment before taking any further action. Consider using Rollback instead.
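Put together, these settings might look like the sketch below. The timeout values are illustrative only. The parameter names -UpgradeDomainTimeoutSec and -UpgradeTimeoutSec are assumed to match those of the application upgrade cmdlet, so check them with Get-Help Start-ServiceFabricComposeDeploymentUpgrade before relying on them.

```powershell
# Illustrative values; tune the timeouts to how long your containers take to start.
$upgradeArgs = @{
    DeploymentName          = 'mydeployment'
    Compose                 = 'docker-compose.yml'
    Monitored               = $true        # run health checks between upgrade domains
    FailureAction           = 'Rollback'   # roll back automatically instead of waiting
    UpgradeDomainTimeoutSec = 1200         # assumed parameter name: per-domain limit
    UpgradeTimeoutSec       = 3600         # assumed parameter name: overall limit
}

# Splat the settings onto the upgrade cmdlet.
Start-ServiceFabricComposeDeploymentUpgrade @upgradeArgs
```

With finite timeouts and a Rollback failure action, a failing upgrade should eventually revert on its own rather than staying in RollingForwardInProgress indefinitely.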
You might have to configure other parameters as well. To understand these parameters, take a look here and here. For compose deployments, check this: