Ansible folder structure

I'm coming from a Puppet background using Vagrant and have some trouble making sense of Ansible and its differences.

My Puppet structure looked like this:

puppet
├── servers
│   └── Backend
│       └── Vagrantfile
└── src
    ├── manifests
    │   └── nodes
    │       └── development
    │           └── backend.pp
    └── modules
        └── mysql

Setup was simple: cd to the directory containing the Vagrantfile and fire up the VM with Vagrant.

Now this is my first draft of an Ansible folder structure:

ansible
├── servers
│   └── Backend
│       ├── Vagrantfile
│       └── ansible.cfg
└── sources
    ├── backend.yml
    ├── site.yml
    ├── inventories
    │   └── development
    │       ├── group_vars
    │       │   ├── all
    │       │   └── backend
    │       └── hosts
    ├── playbooks
    └── roles
        └── mysql

This raises the following questions:

Is this best practice for Ansible or too close to Puppet?
Is it correct to treat backend.yml like a Puppet node manifest?
Where should I put site.yml and backend.yml? One example I found has them in the main directory, while another has them in the 'plays' directory. What's the difference?
I think my variables in group_vars/backend aren't being used correctly. How do I access them?

Sources:

http://leucos.github.io/ansible-files-layout/

https://github.com/ansible/ansible-examples

https://github.com/enginyoyen/ansible-best-practises

Solution

I use the following structures, depending on the complexity of the environment (see the directory-layout documentation):

Simple environment

I use this structure when there is only one environment or when I use playbooks in provisioning mode.

ansible
├── inventory
│   ├── hosts
│   └── group_vars
│       └── my_group.yml
├── roles
│   └── mysql
├── ansible.cfg
├── README.md
├── playbook1.yml
└── playbook2.yml

In ansible.cfg, I set inventory = ./inventory in the [defaults] section so that I don't have to pass the inventory path to the ansible-* commands.
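As a minimal sketch, the ansible.cfg for this layout could look like the fragment below. It assumes the file sits at the root of the ansible directory shown above; Ansible reads this setting from the `[defaults]` section.

```ini
# ansible.cfg (assumed to live at the root of the "ansible" directory)
[defaults]
# Default inventory source, so "-i" can be omitted on the command line
inventory = ./inventory
```

With this in place, running a playbook from that directory (for example `ansible-playbook playbook1.yml`) should pick up inventory/hosts and inventory/group_vars without an explicit `-i` option.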

Medium/Complex environment

I use this structure when there is more than one environment.

ansible
├── inventories
│   ├── production
│   │   ├── hosts
│   │   └── group_vars
│   │       └── my_group.yml
│   └── development
│       ├── hosts
│       └── group_vars
│           └── my_group.yml
├── playbooks
│   ├── playbook1
│   │   ├── group_vars
│   │   │   └── my_group.yml
│   │   ├── roles
│   │   │   └── mysql
│   │   ├── README.md
│   │   └── site.yml
│   ...
├── README.md
└── ansible.cfg

In this case, there is a folder for each environment in ./inventories.

I also prefer to use a dedicated folder for each playbook, so that I can easily use a group_vars folder at playbook level, as described in the variable precedence section. As environments become more complex, the number of variables grows. A group_vars (and host_vars) folder at playbook level lets you define variables that are common to all environments, which reduces the number of variables you need in the inventories.

Ultimate level environment

I have used Ansible to manage systems with more than 5000 servers. Here are some tips for handling more complex environments:

Split inventory file

Use multiple files to define your inventory instead of a single hosts file. In this case, the hosts file contains only server names, and the other files contain groups that describe the servers from different perspectives:

└── production
    ├── hosts
    ├── middleware
    └── trigram

middleware: Groups that map servers to the middleware or other software they run. I use this file to map servers to, for example, tomcat, java, postgresql, etc. I use it, for example, with playbooks that deploy monitoring agents and need to know how to retrieve metrics and logs from tomcat, java, postgresql, etc.
trigram: On my projects, I usually use fixed-length codes (3 or 4 characters) to identify business components (e.g. 'STK' for stock management). I then create a group file that maps each business component to its servers (which servers are used to deploy 'STK').

When you create a new playbook, choose the perspective it should use to address the different environments.

Caution: I think Ansible loads these files in alphabetical order by name, so you can't define a group that refers to a group that hasn't been loaded yet.
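To make the three files more concrete, here is a rough sketch of what they might contain in INI inventory format. The host names (app01.example.com and so on) and the group memberships are made up for illustration; only the file names and the 'STK' code come from the layout above.

```ini
# --- production/hosts: server names only, no groups ---
app01.example.com
app02.example.com
db01.example.com

# --- production/middleware: groups by installed middleware ---
[tomcat]
app01.example.com
app02.example.com

[postgresql]
db01.example.com

# --- production/trigram: groups by business component ---
[STK]
app01.example.com
db01.example.com
```

When the inventory source is the production directory itself rather than a single file, Ansible reads the files in it and merges the result, so a playbook can target `tomcat` or `STK` depending on the perspective it needs.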

Use folders for `group_vars`

In group_vars, instead of a single file per group, you can use a folder named after the group that contains several files:

└── production
    └── group_vars
        └── my_group
            ├── vars1.yml
            └── vars2.yml

This is useful for splitting huge files, or when some variables are generated by tools: in that case, vars1.yml is kept under git and vars2.yml is generated.

Split git repo

When you use Ansible for a huge system, there are a lot of commits, and one question comes up often: how do I split my huge git repo?

In my case, I use one git repo for each folder in ./inventories, each with different access rules, and one git repo for each folder in ./playbooks, also with different access rules.