
How to acquire complete list of subdirs (including subdirs of subdirs)?


I have thousands of city folders (for example city1, city2, and so on, but in reality named like NewYork, Boston, etc.). Each folder further contains two subfolders: land and house.

So the directory structure is like:

current directory
  ---- city1
     ----- house 
         ------ many .xlsx files
     ----- land
  ----- city2
  ----- city3
  ···
  ----- city1000

I want to get the complete list of all subdirs and do some manipulation (like import excel). I know there is a macro extended function: local list: dir to handle this issue, but it seems it can only return the first tier of subdirs, like city_i, rather than those deeper ones.

More specifically, if I want to take action within all house folders, what kind of workflow do I need?

I have made an initial attempt to write code to achieve my goal:

cd G:\Data_backup\Soufang_data
local folder: dir . dirs "*"
foreach i of local folder {
     local `i'_house : dir  "G:\Data_backup\Soufang_data\``i''\house" files "*.xlsx"

     local count = 1
     foreach j of local `i'_house {
        cap import excel "`j'",clear
        cap sxpose,clear
        cap drop in 1/1

        if `count'==1 {
          save `i'.dta, replace
            }
        else          {
         cap qui append using `i'
         save `i'.dta,replace
            }

       local ++count
     }
}

There is something wrong with:

``i'' 

in the dir call. I could not get it to work.

I have another post on this project.


Supplementary remarks:

As Nick points out, it's the backslash that causes the trouble. Moving on from that point, however, I encounter another problem. Leaving out the complicated actions, I just want to test whether my loops work, so I wrote the following code snippet:

set more off
cd G:\Data_backup\Soufang_data
local folder: dir . dirs "*"
foreach i of local folder {
     di "`i'"
     local `i'_house : dir  "G:\Data_backup\Soufang_data/`i'\house" files "*.xlsx"

     foreach j of local `i'_house {
        di "`j'"
     }
}

However, the outcome on the screen is something like:

city1
project100
project99
······
project1

It seems the code loops only once, over the first city, and never reaches city2, city3 and so on. I suspect the cause is how I wrote the local, especially in this line, but I'm not sure:

foreach j of local `i'_house

Solution

  • Although not a solution to whatever problem you're actually presenting, an easier way might be to use filelist, from SSC (ssc install filelist).

    An example might be:

    . // list all files
    . filelist, directory("D:\Datos\RFERRER\Desktop\example")
    Number of files found = 5
    
    . 
    . // strange way of tagging directories ending in "/house"
    . // change at will
    . gen tag = substr(reverse(dirname),1,6) == "esuoh/"
    
    . 
    . order tag
    
    . list
    
         +----------------------------------------------------------------------------------------------+
         | tag   dirname                                                     filename             fsize |
         |----------------------------------------------------------------------------------------------|
      1. |   0   D:\Datos\RFERRER\Desktop\example/proj_1                     newfile.txt              0 |
      2. |   1   D:\Datos\RFERRER\Desktop\example/proj_2/house               somefile.txt             0 |
      3. |   0   D:\Datos\RFERRER\Desktop\example/proj_3/subproj_3_2         newfile2.txt             0 |
      4. |   1   D:\Datos\RFERRER\Desktop\example/proj_3/subproj_3_2/house   anothernewfile.txt       0 |
      5. |   1   D:\Datos\RFERRER\Desktop\example/proj_3/subproj_3_2/house   someotherfile.txt        0 |
         +----------------------------------------------------------------------------------------------+
    

    Afterwards, use keep or drop, conditional on variable tag.
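
    A minimal sketch of that follow-up step, assuming filelist is installed and the current directory is the parent of the city folders. The file name house_files.dta and the local names are hypothetical, and the per-file processing is left as a placeholder:

    ```stata
    * Sketch only: assumes -filelist- is installed (ssc install filelist)
    * and that the current directory is the parent of the city folders.
    filelist, directory(".") pattern("*.xlsx")

    * keep only files whose folder name ends in "/house"
    keep if substr(dirname, -6, 6) == "/house"

    * house_files.dta is a hypothetical name for the saved file list
    save "house_files.dta", replace
    local n = _N

    forvalues k = 1/`n' {
        * reload the list, keeping only the k-th row, and build the full path
        use "house_files.dta" in `k', clear
        local f = dirname + "/" + filename

        display as text "importing `f'"
        import excel using "`f'", clear
        * per-file processing (and saving or appending) goes here
    }
    ```

    Saving the list first and reloading one row at a time keeps the file list intact even though import excel replaces the data in memory on every pass.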

    Graphically, the directory looks like:

    [image: screenshot of the example directory tree]

    (I'm on Stata 13. Check help string functions for other ways to tag.)
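
    On the backslash mentioned in the question: in Stata, a backslash placed directly before a left quote (`) stops the macro from being expanded, so a path written as ...\`i'\house is not resolved. Forward slashes avoid this, and Stata accepts them on Windows too. A minimal sketch, assuming every city folder has a house subfolder; the local names root and files are hypothetical, and a single fixed name is used for the file list instead of one macro per city:

    ```stata
    * Sketch only: the root path is the one quoted in the question.
    local root "G:/Data_backup/Soufang_data"
    local folder : dir "`root'" dirs "*"

    foreach i of local folder {
        * forward slashes, so the city macro is expanded inside the path
        local files : dir "`root'/`i'/house" files "*.xlsx"

        foreach j of local files {
            * dir returns bare file names, so prepend the folder
            display "`root'/`i'/house/`j'"
        }
    }
    ```

    This only lists the full paths. Any import excel call inside the inner loop would need the same full path rather than `j' alone.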