Search code examples
rexcelregeximportreadr

List nested Excel files in R directories


I have the following directory structure that I'm working with on Windows 10.

C:
├── my-R-working-directory/
    ├── excel01.xlsx
    ├── excel02.xlsx
    ├── nested-folder-A
    |   ├── excel03.xlsx
    └── nested-folder-B
        └── excel04.xlsx

I can list all the Excel files directly within my working directory with the following command:

getwd()
#> "C:/my-R-working-directory"

list.files("path/of/folder", pattern = "\\.xlsx$", full.names = TRUE)
#> [1] "C:/my-R-working-directory/excel01.xlsx
#> [2] "C:/my-R-working-directory/excel02.xlsx

What regular expression can I use for the pattern argument to list all Excel files one nested level deep? In other words I only want to see Excel files in nested-folder-A and nested-folder-B, which would be excel03.xlsx and excel04.xlsx (and any other Excel files that are in folders that happen to be one level deep).

My thought was to use what you see below, but it returns character(0), which I assume means my regular expression is wrong:

list.files("path/of/folder", pattern = "\\\\.xlsx$", full.names = TRUE)
#> character(0)

I took the pattern = "\\.xlsx$" that looks in the working directory and tried to go one level up, ie pattern = "\\\\.xlsx$". Attempt failed. What's the correct way to do this?


Solution

  • If you looked at the recursive argument in list.files() it lets you see the files in lower level directories. In conjunction with pattern = "\\.xlsx$" will match the Excel files in all recursive directories.

    list.files("path/of/folder", pattern = "\\.xlsx$", full.names = TRUE, recursive = TRUE)

    This answer might help if you want to do level-specific recursive list files.