I am working on ansible playbook performance tuning with the following playbook
---
# simplespeedtest.yml
- hosts: localhost
connection: local
gather_facts: false
tasks:
- command: echo 1
- command: echo 2
- command: echo 30
I have test the running time of the playbook and every task in the playbook by tasts_profile modules, the result is as follow:
# time $ANSIBLE_HOME/bin/ansible-playbook -i hosts simplespeed_localhost.yml
PLAY [localhost] ********************************************************************************
TASK [command] ********************************************************************************
Thursday 11 January 2018 15:29:45 +0800 (0:00:00.091) 0:00:00.091 ******
changed: [localhost]
TASK [command] ********************************************************************************
Thursday 11 January 2018 15:29:45 +0800 (0:00:00.316) 0:00:00.407 ******
changed: [localhost]
TASK [command] ********************************************************************************
Thursday 11 January 2018 15:29:46 +0800 (0:00:00.205) 0:00:00.613 ******
changed: [localhost]
PLAY RECAP ********************************************************************************
localhost : ok=3 changed=3 unreachable=0 failed=0
Thursday 11 January 2018 15:29:46 +0800 (0:00:00.187) 0:00:00.800 ******
===============================================================================
command ---------------------------------- 0.32s
command ----------------------------------- 0.21s
command ----------------------------------- 0.19s
Playbook run took 0 days, 0 hours, 0 minutes, 0 seconds
real 0m1.894s
user 0m1.652s
sys 0m0.220s
and total time to execute the task is 0.72s, but the total time to execute the playbook is 1.894s which is bigger the the task total time, I want to know how to reduce this time. By running the following command:
time python -m cProfile -o test1.out -s time $ANSIBLE_HOME/bin/ansible-playbook -i hosts simplespeed_localhost.yml
I got the python cprofile file
ncalls tottime percall cumtime percall filename:lineno(function)
633 0.673 0.001 0.673 0.001 {time.sleep}
34730 0.072 0.000 0.076 0.000 /usr/local/lib/python2.7/dist-packages/yaml/reader.py:99(forward)
3597 0.069 0.000 0.213 0.000 /usr/local/lib/python2.7/dist-packages/yaml/scanner.py:1272(scan_plain)
63467 0.058 0.000 0.128 0.000 /usr/local/lib/python2.7/dist-packages/yaml/scanner.py:142(need_more_tokens)
38543 0.051 0.000 0.585 0.000 /usr/local/lib/python2.7/dist-packages/yaml/scanner.py:113(check_token)
66812 0.046 0.000 0.055 0.000 /usr/local/lib/python2.7/dist-packages/yaml/scanner.py:276(stale_possible_simple_keys)
153 0.042 0.000 0.093 0.001 /usr/lib/python2.7/ConfigParser.py:464(_read)
46 0.040 0.001 0.040 0.001 {compile}
162917 0.039 0.000 0.040 0.000 /usr/local/lib/python2.7/dist-packages/yaml/reader.py:87(peek)
22624 0.032 0.000 0.032 0.000 /usr/local/lib/python2.7/dist-packages/yaml/error.py:6(__init__)
4981 0.031 0.000 0.150 0.000 /usr/local/lib/python2.7/dist-packages/yaml/parser.py:273(parse_node)
967/207 0.030 0.000 0.089 0.000 /usr/lib/python2.7/sre_parse.py:379(_parse)
7649 0.028 0.000 0.428 0.000 /usr/local/lib/python2.7/dist-packages/yaml/scanner.py:153(fetch_more_tokens)
101 0.026 0.000 0.026 0.000 {method 'read' of 'file' objects}
22624 0.026 0.000 0.058 0.000 /usr/local/lib/python2.7/dist-packages/yaml/reader.py:114(get_mark)
100506 0.025 0.000 0.029 0.000 {isinstance}
and the call of {time.sleep} time is 0.673s, and I want to know which process of ansible will call the {time.sleep}, and how to reduce it
ps: I got 3 posotion call the {time.sleep} from the ansible source code in ./lib/ansible/executor/task_executor.py and change the arg but it seems do not work
You need to think more carefully about the task you're trying to accomplish. A simple playbook with trivial tasks and no SSH connections is not going to be representative of the actual performance of a real-life ansible playbook, so any profiling results you might derive are going to be pretty meaningless.
Additionally, Ansible's tasks are executed remotely on the target machine (or for local tasks, executed in a child process). This means that the profile will not be able to measure the actual module code being executed. So simply running ansible-playbook
through a profiler isn't going to show the full story.
With that said, let's try to get more information from the profiler. You can generate a call graph from the profile output using gprof2dot
, as detailed here. The graph will be colored according to the time spent in each function.
Here is what that graph looks like when I ran the playbook on my computer. It may look different based on the performance characteristics of your system:
Looking at the graph, we can see that time.sleep()
is being called from the ansible.plugins.strategy.__init__.py
module. We can see from the code that the sleep occurs while ansible is waiting for tasks to complete. You can change the polling interval in the configuration file, but the default is 0.001 seconds which is already pretty fast. It is unlikely that changing this will make a significant difference.
To summarize:
Here are some performance-related suggestions from the Ansible team that may be helpful: