
Traverse directories recursively and return a nested list with the subdirectories and files in Python


I would like to recursively traverse a directory in Python and get a nested list of all the children directories and files. I have found dozens of solutions out there to solve the first part (recursively traverse directories), but none of them allow me to get the output in the format that I need.

There are no restrictions/preferences for which libraries to use. I tried with pathlib, but os.walk() is just fine, too. Also, it doesn't have to be a recursive function. A loop is fine.

I have the following structure:

root
├── file1.txt
├── file2.txt
├── sub1
│   ├── subfile1.txt
│   └── subsub
│       └── subsubfile1.txt
└── sub2

And I need the result to be a nested list like so:

[
  {
    'name': 'file1.txt'
  },
  {
    'name': 'file2.txt'
  },
  {
    'name': 'sub1',
    'children': [
      {
        'name': 'subfile1.txt'
      },
      {
        'name': 'subsub',
        'children': [
          {
            'name': 'subsubfile1.txt'
          }
        ]
      }
    ]
  },
  {
    'name': 'sub2',
    'children': []
  }
]

This is how far I've gotten, but it doesn't give the correct results:

from pathlib import Path
def walk(path: Path, result: list) -> list:
    for p in path.iterdir():
        if p.is_file():
            result.append({
                'name': p.name
            })
            yield result
        else:
            result.append({
                'name': p.name,
                'children': list(walk(p, result))
            })
walk(Path('root'), [])  # initial call

Besides the fact that this code doesn't work, I also get a problem with the recursive collection. When I try to pretty print it, it shows:

'children': [ <Recursion on list with id=4598812496>,
                    <Recursion on list with id=4598812496>],
      'name': 'sub1'},

Is it possible to get that Recursion object as a list?

If anyone's wondering why I need that structure rather than a flat sequence like the one produced by Path.glob(), it's because the list will be consumed by a Vuetify treeview on the other side of my API: https://vuetifyjs.com/en/components/treeview/#slots


Solution

  • You can call os.listdir recursively:

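    import os
    def to_tree(s=None):
      # Resolve the default per call; os.getcwd() in the signature would be evaluated once, at definition time.
      s = os.getcwd() if s is None else s
      # sorted() makes the order deterministic; os.listdir returns entries in arbitrary order.
      return [{'name': i} if os.path.isfile(os.path.join(s, i)) else
                  {'name': i, 'children': to_tree(os.path.join(s, i))} for i in sorted(os.listdir(s))]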
    

    Running the function above on a file structure similar to your example:

    import json
    print(json.dumps(to_tree(), indent=4))
    

    Output:

    [
        {
            "name": "file1.txt"
        },
        {
            "name": "file2.txt"
        },
        {
            "name": "sub1",
            "children": [
                {
                    "name": "subfile1.txt"
                },
                {
                    "name": "subsub",
                    "children": [
                        {
                            "name": "subsubfile1.txt"
                        }
                    ]
                }
            ]
        },
        {
            "name": "sub2",
            "children": []
        }
    ]
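
As for the <Recursion on list ...> marker in the question: the original walk() passes the same result list into every recursive call and also yields that list, so each 'children' entry ends up holding the very list that contains it, and pprint flags the self-reference. If you would rather stay with pathlib, one way around this is to build a fresh list per directory and return it instead of yielding. The following is only a sketch under that assumption; sorting by name is my addition for stable output and not something the question asks for.

```python
from pathlib import Path


def walk(path: Path) -> list:
    """Return the entries of `path` as a nested list of dicts."""
    result = []  # a fresh list per directory, never shared between levels
    # Sorting by name is an assumption made for stable output;
    # iterdir() itself yields entries in arbitrary order.
    for p in sorted(path.iterdir(), key=lambda entry: entry.name):
        if p.is_dir():
            # Recurse and store the sub-list returned for this directory.
            result.append({'name': p.name, 'children': walk(p)})
        else:
            result.append({'name': p.name})
    return result
```

Calling walk(Path('root')) on the structure from the question should give the same nested list as to_tree above, and the result can be passed to json.dumps without any recursion markers.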