Ignore dags in subfolders using .airflowignore


I have the following dir structure:

.
├── project
│   ├── dag_1
│   │   ├── dag
│   │   │   ├── current
│   │   │   │   └── dag_1_v2.py
│   │   │   └── deprecated
│   │   │       └── dag_1_v1.py
│   │   └── sparkjobs
│   │       ├── current
│   │       └── deprecated
│   └── dag_2
│       ├── dag
│       │   ├── current
│       │   │   └── dag_2_v2.py
│       │   └── deprecated
│       │       └── dag_2_v1.py
│       └── sparkjobs
│           ├── current
│           └── deprecated

I want Airflow to ignore all deprecated folders, so I am using .airflowignore. When I place an .airflowignore containing */deprecated inside the dag_1 or dag_2 folder, Airflow ignores the deprecated DAG in that folder:

├── project
│   ├── dag_1
│   │   ├── .airflowignore
│   │   ├── dag
│   │   │   ├── current
│   │   │   │   └── dag_1_v2.py
│   │   │   └── deprecated
│   │   │       └── dag_1_v1.py

With this approach, I would have to place an .airflowignore inside each dag_n folder. When I instead put a single .airflowignore containing **/**/deprecated in the project folder, the deprecated DAGs show up in Airflow again:

├── project
│   ├── .airflowignore
│   ├── dag_1
│   │   ├── dag
│   │   │   ├── current
│   │   │   │   └── dag_1_v2.py
│   │   │   └── deprecated
│   │   │       └── dag_1_v1.py

My question is: how can I use a single .airflowignore at the project directory level to ignore all deprecated folders inside each dag_n/dag folder? Is this possible?


Solution

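  • When Airflow is set to glob syntax (the dag_ignore_file_syntax option, added in Airflow 2.3; older releases treat each line as a regular expression), .airflowignore follows the same logic as .gitignore, so whatever solution applies to .gitignore will also work here.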

    I believe what you are after is just

    deprecated/
    

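    in a single .airflowignore at the top (project) level. In .gitignore terms, a pattern with a trailing slash and no other slash matches a directory of that name at any depth.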

    See also ignoring any 'bin' directory on a git project
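
To make the effect of that one-line pattern concrete, here is a rough sketch in plain Python. It is not Airflow's own matching code; it only emulates what a gitignore-style `deprecated/` entry is expected to do, which is to skip every directory named `deprecated` at any depth. The helper names `find_dag_files` and `build_example_tree` are made up for this illustration, and the tree is recreated in a temporary directory from the layout in the question.

```python
import os
import tempfile
from pathlib import Path


def find_dag_files(root, ignored_dir_name="deprecated"):
    """Return .py files under root, skipping any directory named ignored_dir_name.

    Hypothetical helper: it imitates the outcome of a 'deprecated/' ignore
    pattern, it does not call into Airflow.
    """
    found = []
    for current, dirs, files in os.walk(root):
        # Prune in place so os.walk never descends into ignored directories.
        dirs[:] = [d for d in dirs if d != ignored_dir_name]
        for name in files:
            if name.endswith(".py"):
                found.append(Path(current, name).relative_to(root).as_posix())
    return sorted(found)


def build_example_tree(root):
    """Recreate the dag_1/dag_2 layout from the question inside root."""
    for dag in ("dag_1", "dag_2"):
        for state, version in (("current", "v2"), ("deprecated", "v1")):
            folder = Path(root, dag, "dag", state)
            folder.mkdir(parents=True)
            (folder / f"{dag}_{version}.py").touch()
            Path(root, dag, "sparkjobs", state).mkdir(parents=True)


with tempfile.TemporaryDirectory() as project:
    build_example_tree(project)
    remaining = find_dag_files(project)
    print(remaining)
```

Only the two files under the `current` folders should remain in the printed list. If the deprecated DAGs still appear in Airflow itself, it is worth checking which ignore-file syntax the installation is actually using.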