Search code examples
excelloopsimportstata

Extension of Excel files remains when doing an import loop in Stata


I don't understand why when running the following simple piece of code my files get saved as xxx.xls.dta instead of xxx.dta and how to fix that.

clear all
cd "C:\Users\User\Documents\PhD\Research\Data\acqua_costiera"
global RawAcqua "C:\Users\User\Documents\PhD\Research\Data\acqua_costiera"

local acquacost : dir "C:\Users\User\Documents\PhD\Research\Data\acqua_costiera" files "*.xls", respectcase

foreach f of local acquacost {
    import excel using "$RawAcqua\\`f'", cellrange(B4:N43310) firstrow clear
    save "`f'.dta", replace
}

I have tried looking at other similar pieces of code online but I don't really get what is my mistake.

I'm looping over all of the files in that directory, but the files are named "acqua_costiera_`year'" so I guess some other loop might work too.


Solution

  • Your problem can be easily illustrated using the following toy example:

    local acquacost one.xls two.xls three.xls
    
    foreach f of local acquacost {
        display "`f'.dta"
    }
    
    one.xls.dta
    two.xls.dta
    three.xls.dta
    

    You need to tell Stata to only keep the filename, not the extension:

    foreach f of local acquacost {
        display "`= substr("`f'", 1, strpos("`f'", ".") - 1)'.dta"
    }
    

    Here i use the strpos() function to get the position of the first period and then use this as a reference point to extract the relevant portion of the string using the substr() function.

    For saving the file with the proper name just use save, instead of display.