Search code examples
pythonansibleperformance-testingcprofile

how to reduce {time.sleep} time in ansible playbook performance tuning


I am working on ansible playbook performance tuning with the following playbook

---
# simplespeedtest.yml
- hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - command: echo 1
    - command: echo 2
    - command: echo 30

I have test the running time of the playbook and every task in the playbook by tasts_profile modules, the result is as follow:

# time $ANSIBLE_HOME/bin/ansible-playbook -i hosts simplespeed_localhost.yml

PLAY [localhost] ********************************************************************************
TASK [command] ********************************************************************************
Thursday 11 January 2018  15:29:45 +0800 (0:00:00.091)       0:00:00.091 ****** 
changed: [localhost]

TASK [command] ********************************************************************************
Thursday 11 January 2018  15:29:45 +0800 (0:00:00.316)       0:00:00.407 ****** 
changed: [localhost]

TASK [command] ********************************************************************************
Thursday 11 January 2018  15:29:46 +0800 (0:00:00.205)       0:00:00.613 ****** 
changed: [localhost]

PLAY RECAP ********************************************************************************
localhost                  : ok=3    changed=3    unreachable=0    failed=0   

Thursday 11 January 2018  15:29:46 +0800 (0:00:00.187)       0:00:00.800 ****** 
=============================================================================== 
command ---------------------------------- 0.32s
command ----------------------------------- 0.21s
command ----------------------------------- 0.19s
Playbook run took 0 days, 0 hours, 0 minutes, 0 seconds

real    0m1.894s
user    0m1.652s
sys 0m0.220s

and total time to execute the task is 0.72s, but the total time to execute the playbook is 1.894s which is bigger the the task total time, I want to know how to reduce this time. By running the following command:

time python -m cProfile -o test1.out -s time $ANSIBLE_HOME/bin/ansible-playbook -i hosts simplespeed_localhost.yml

I got the python cprofile file

 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      633    0.673    0.001    0.673    0.001 {time.sleep}
    34730    0.072    0.000    0.076    0.000 /usr/local/lib/python2.7/dist-packages/yaml/reader.py:99(forward)
     3597    0.069    0.000    0.213    0.000 /usr/local/lib/python2.7/dist-packages/yaml/scanner.py:1272(scan_plain)
    63467    0.058    0.000    0.128    0.000 /usr/local/lib/python2.7/dist-packages/yaml/scanner.py:142(need_more_tokens)
    38543    0.051    0.000    0.585    0.000 /usr/local/lib/python2.7/dist-packages/yaml/scanner.py:113(check_token)
    66812    0.046    0.000    0.055    0.000 /usr/local/lib/python2.7/dist-packages/yaml/scanner.py:276(stale_possible_simple_keys)
      153    0.042    0.000    0.093    0.001 /usr/lib/python2.7/ConfigParser.py:464(_read)
       46    0.040    0.001    0.040    0.001 {compile}
   162917    0.039    0.000    0.040    0.000 /usr/local/lib/python2.7/dist-packages/yaml/reader.py:87(peek)
    22624    0.032    0.000    0.032    0.000 /usr/local/lib/python2.7/dist-packages/yaml/error.py:6(__init__)
     4981    0.031    0.000    0.150    0.000 /usr/local/lib/python2.7/dist-packages/yaml/parser.py:273(parse_node)
  967/207    0.030    0.000    0.089    0.000 /usr/lib/python2.7/sre_parse.py:379(_parse)
     7649    0.028    0.000    0.428    0.000 /usr/local/lib/python2.7/dist-packages/yaml/scanner.py:153(fetch_more_tokens)
      101    0.026    0.000    0.026    0.000 {method 'read' of 'file' objects}
    22624    0.026    0.000    0.058    0.000 /usr/local/lib/python2.7/dist-packages/yaml/reader.py:114(get_mark)
   100506    0.025    0.000    0.029    0.000 {isinstance}

and the call of {time.sleep} time is 0.673s, and I want to know which process of ansible will call the {time.sleep}, and how to reduce it

ps: I got 3 posotion call the {time.sleep} from the ansible source code in ./lib/ansible/executor/task_executor.py and change the arg but it seems do not work


Solution

  • You need to think more carefully about the task you're trying to accomplish. A simple playbook with trivial tasks and no SSH connections is not going to be representative of the actual performance of a real-life ansible playbook, so any profiling results you might derive are going to be pretty meaningless.

    Additionally, Ansible's tasks are executed remotely on the target machine (or for local tasks, executed in a child process). This means that the profile will not be able to measure the actual module code being executed. So simply running ansible-playbook through a profiler isn't going to show the full story.

    With that said, let's try to get more information from the profiler. You can generate a call graph from the profile output using gprof2dot, as detailed here. The graph will be colored according to the time spent in each function.

    Here is what that graph looks like when I ran the playbook on my computer. It may look different based on the performance characteristics of your system:

    profiler call graph

    Looking at the graph, we can see that time.sleep() is being called from the ansible.plugins.strategy.__init__.py module. We can see from the code that the sleep occurs while ansible is waiting for tasks to complete. You can change the polling interval in the configuration file, but the default is 0.001 seconds which is already pretty fast. It is unlikely that changing this will make a significant difference.

    To summarize:

    • You need a more realistic playbook for effective benchmarking
    • The profiler data is limited due to remote task execution
    • Changing the sleep interval is not going to improve performance significantly

    Here are some performance-related suggestions from the Ansible team that may be helpful: