Retry task on a Windows node if unreachable

Is there a way to retry a task if the Windows node is temporarily unreachable?

For example, I tried

- name: Hello
  ansible.windows.win_powershell:
    script: | 
      Write-Host "hello"
  register: _status
  until: _status is not unreachable
  retries: 3
  delay: 200

But, after 30 seconds, I got

fatal: [mylocalwin]: UNREACHABLE! => changed=false 
  msg: 'certificate: HTTPSConnectionPool(host=''xxx.xxx.xxx.xxx'', port=5986): Max retries exceeded with url: /wsman (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f4160b63eb0>, ''Connection to xxx.xxx.xxx.xxx timed out. (connect timeout=30)''))'
  unreachable: true

I would like to retry three times before failing.

Solution

Here is my solution, based on https://github.com/ansible/ansible/issues/25532#issuecomment-428386816

Modify the following two files of the pywinrm package:

/lib/python3.10/site-packages/winrm/protocol.py

class Protocol(object):
    def __init__(
            ...
            reconnection_retries=0,
            reconnection_backoff_factor=2.0
        ):
        ...
        
        self.transport = Transport(
            ...      
            reconnection_retries=reconnection_retries,
            reconnection_backoff_factor=reconnection_backoff_factor
        )

/lib/python3.10/site-packages/winrm/transport.py

class Transport(object):
    def __init__(
        ...
        reconnection_retries=0,
        reconnection_backoff_factor=2.0):
        
        ...
        self.reconnection_retries = reconnection_retries
        self.reconnection_backoff_factor = reconnection_backoff_factor
        ...
        
    def build_session(self):
        ...
        
        # Merge proxy environment variables
        settings = session.merge_environment_settings(url=self.endpoint,
                      proxies=proxies, stream=None, verify=None, cert=None)
        # ADDED: retry on connection errors, with a backoff factor
        retries = requests.packages.urllib3.util.retry.Retry(total=self.reconnection_retries,
                                                             connect=self.reconnection_retries,
                                                             status=self.reconnection_retries,
                                                             read=0,
                                                             backoff_factor=self.reconnection_backoff_factor,
                                                             status_forcelist=(413, 425, 429, 503))
        # ADDED: mount the retrying adapter for both URL schemes
        session.mount('http://', requests.adapters.HTTPAdapter(max_retries=retries))
        session.mount('https://', requests.adapters.HTTPAdapter(max_retries=retries))
        ...

With these changes, the retry behaviour for an unreachable node can be controlled through host variables:

- name: Test
  hosts: mylocalwin
  gather_facts: false
  vars:
    ansible_winrm_reconnection_backoff_factor: 2.0
    ansible_winrm_reconnection_retries: 4

  tasks:
    - name: Hello
      ansible.windows.win_powershell:
        script: | 
          Write-Host "hello"
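
The two variables above reach the patched constructor because, as far as I understand it, the Ansible winrm connection plugin forwards extra ansible_winrm_* variables to Protocol when the name after the prefix matches one of its constructor arguments. The following is only a minimal sketch of that idea, not the plugin's actual code: the Protocol class is a stand-in for the patched pywinrm class, and select_winrm_kwargs is a hypothetical helper written for illustration.

```python
import inspect


class Protocol:
    """Stand-in for the patched winrm.protocol.Protocol (illustration only)."""

    def __init__(self, endpoint, reconnection_retries=0,
                 reconnection_backoff_factor=2.0):
        self.endpoint = endpoint
        self.reconnection_retries = reconnection_retries
        self.reconnection_backoff_factor = reconnection_backoff_factor


def select_winrm_kwargs(host_vars, protocol_cls, prefix="ansible_winrm_"):
    """Hypothetical helper: keep prefixed variables that match a constructor argument."""
    supported = set(inspect.signature(protocol_cls.__init__).parameters) - {"self"}
    kwargs = {}
    for name, value in host_vars.items():
        if name.startswith(prefix):
            arg = name[len(prefix):]
            if arg in supported:
                kwargs[arg] = value
    return kwargs


host_vars = {
    "ansible_winrm_reconnection_backoff_factor": 2.0,
    "ansible_winrm_reconnection_retries": 4,
    "ansible_winrm_not_a_protocol_argument": True,  # dropped: no matching argument
}
kwargs = select_winrm_kwargs(host_vars, Protocol)
protocol = Protocol("https://xxx.xxx.xxx.xxx:5986/wsman", **kwargs)
print(kwargs)
```

If this assumption holds, it also explains why the variables have no effect on an unpatched pywinrm: without the new constructor arguments there is nothing for them to match.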

I checked the solution with tcpdump and can confirm that the groups of TCP SYN packets are re-sent reconnection_retries times.

Here is a short recap of the performance:

TYPE                ERROR DETECTION (sec)   NUM OF TCP SYN SENT
RETRY_0_BACKOFF_2   30                      5
RETRY_1_BACKOFF_2   60                      10
RETRY_2_BACKOFF_2   94                      15
RETRY_3_BACKOFF_2   133                     20
RETRY_4_BACKOFF_2   179                     25
RETRY_5_BACKOFF_2   240                     30
NO_RETRY_MECHANISM  30                      5
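
The detection times in the table can be roughly reproduced with a small model. This is a sketch under assumptions, not a measurement: it assumes every attempt runs into the 30-second connect timeout from the error message, that urllib3 does not sleep before the first retry and then sleeps backoff_factor * 2**(n - 1) seconds before retry n (capped at 120 seconds), and that each attempt sends 5 SYN packets as in the NO_RETRY_MECHANISM row. The function names are made up for this example.

```python
CONNECT_TIMEOUT = 30   # seconds, from "connect timeout=30" in the error message
BACKOFF_MAX = 120      # seconds, assumed cap on a single backoff sleep
SYN_PER_ATTEMPT = 5    # from the NO_RETRY_MECHANISM row of the table


def backoff_sleeps(retries, backoff_factor):
    """Sleep before each retry: none before the first, then factor * 2**(n - 1)."""
    sleeps = []
    for n in range(1, retries + 1):
        sleep = 0.0 if n == 1 else backoff_factor * 2 ** (n - 1)
        sleeps.append(min(BACKOFF_MAX, sleep))
    return sleeps


def expected_detection_time(retries, backoff_factor=2.0):
    """Total time until the failure is reported: timeouts plus backoff sleeps."""
    attempts = retries + 1
    return attempts * CONNECT_TIMEOUT + sum(backoff_sleeps(retries, backoff_factor))


for retries in range(6):
    print(retries, expected_detection_time(retries), (retries + 1) * SYN_PER_ATTEMPT)
```

Under these assumptions the model gives 30, 60, 94, 132, 178 and 240 seconds for 0 to 5 retries, which is within about one second of the measured values, and 5 to 30 SYN packets, which matches the table.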