
Matching names of elements in a list to filenames & renaming variables in R


I wrote a script to modify several data frames, but I've run into a few issues I can't solve. First, the part that should rename the column mac_sector to sector doesn't work: it renames nothing and gives no error.

Also, when I save the modified datasets, they are simply called 1, 2, 3, etc. I want them to keep the names they originally had. I tried to do this with "names(dflist)[i] <- gsub("\\.dta$", "", files)", but that doesn't work.

The script also gives the warning messages below, although I don't know whether they actually affect the files, as I've seen no complications:

Warning messages:

1: In save.dta13(dflist[[i]], paste0(i, ".dta")) :
  Number of variable labels does not match number of variables.
            Variable labels dropped.
2: In save.dta13(dflist[[i]], paste0(i, ".dta")) :
  Number of variable labels does not match number of variables.
            Variable labels dropped.
3: In save.dta13(dflist[[i]], paste0(i, ".dta")) :
  Number of variable labels does not match number of variables.
            Variable labels dropped. 

Lastly, is there a way to save files to a directory other than the working directory?

My script:

    setwd("C:\\....")

    files = list.files(pattern="*.dta") 
    dflist <- list()

    for (i in 1:length(files)){
      dflist[[i]] <- read.dta13(files[i], nonint.factors = TRUE)


      if("mac_sector" %in% colnames(dflist[[i]])){            #rename mac_sector to sector if present   
        rename(dflist[[i]], c(mac_sector="sector"))}

      if(!("sector" %in% colnames(dflist[[i]]))){             #This creates "sector" variable if it doesn't exist already.
        dflist[[i]]$sector <- "total"}


      names(dflist)[i] <- gsub("\\.dta$", "", files)          #Matching the names of the elements to the filenames

      save.dta13(dflist[[i]], paste0(i, ".dta"))              #Saving dataset
    }

Input: dataframe 1:

country     SA year          DV       VI     DI       DIV     DIV_s  DIV_p                  t            ta               
1  AUSTRIA   NA 2001         0      NA       NA      NA     NA       NA                  0               NA
2  AUSTRIA   NA 2002         0      NA       NA      NA     NA       NA                  0               NA
3  AUSTRIA   NA 2003         0      NA       NA      NA     NA       NA                  0               NA
4  AUSTRIA   NA 2004         0      NA       NA      NA     NA       NA                  0               NA
5  AUSTRIA   NA 2005         0      NA       NA      NA     NA       NA                  0               NA

dataframe 2:

country      mac_sector      SA year          DV       VI     DI       DIV     DIV_s  DIV_p                  t            ta 
1  BELGIUM     ing            0 2001         0      NA       NA      NA     NA       NA               3036       0.09725133
2  BELGIUM     ing            0 2002         0      NA       NA      NA     NA       NA               2970       0.09641831
3  BELGIUM     ing            0 2003         0      NA       NA      NA     NA       NA               2917       0.09791633
4  BELGIUM     ing            0 2004         0      NA       NA      NA     NA       NA               2907       0.10297798
5  BELGIUM     ing            0 2005         0      NA       NA      NA     NA       NA               2904       0.10182869

dataframe 3:

country                       sector SA year          DV       VI     DI       DIV     DIV_s  DIV_p                  t            ta
1  BELGIUM                        prod     0 2001         0      NA       NA      NA     NA       NA                392       0.09688306
2  BELGIUM                        prod     0 2002         0      NA       NA      NA     NA       NA                398       0.09394456
3  BELGIUM                        prod     0 2003         0      NA       NA      NA     NA       NA                394       0.09536502
4  BELGIUM                        prod     0 2004         0      NA       NA      NA     NA       NA                404       0.10367264
5  BELGIUM                        prod     0 2005         0      NA       NA      NA     NA       NA                407       0.08961585

Solution

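  • Try the script below. It renames the column with base R, so the plyr library is no longer needed, and it saves each dataset under its original file name in a new location. Your rename() call had no effect because its result was never assigned back to dflist[[i]], and names(dflist)[i] needs a single name, files[i], rather than the whole files vector: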

    library(readstata13)                                      # provides read.dta13() and save.dta13()

    setwd("C:\\...")
    files <- list.files(pattern = "\\.dta$")                  # pattern is a regular expression, not a glob
    dflist <- list()

    for (i in seq_along(files)) {
      dflist[[i]] <- read.dta13(files[i], nonint.factors = TRUE)

      if ("mac_sector" %in% colnames(dflist[[i]])) {          # rename mac_sector to sector if present
        names(dflist[[i]])[names(dflist[[i]]) == "mac_sector"] <- "sector"
      }

      if (!("sector" %in% colnames(dflist[[i]]))) {           # create "sector" if it doesn't exist already
        dflist[[i]]$sector <- "total"
      }

      names(dflist)[i] <- gsub("\\.dta$", "", files[i])       # name the list element after its file

      # save the dataset under its original name in the new location
      save.dta13(dflist[[i]], paste0("C:\\...\\newlocation\\", names(dflist)[i], ".dta"))
    }
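
If you want to check the renaming and naming logic without touching any Stata files, here is a minimal base-R sketch. The data frames, the file names, and the output folder under tempdir() are all made up for illustration; it also uses tools::file_path_sans_ext() and file.path() as possible alternatives to gsub() and paste0(), which is a swap on my part and not something the script above requires.

```r
# Made-up stand-ins for the .dta files; no Stata files or extra packages needed.
files <- c("austria.dta", "belgium_ing.dta")   # hypothetical file names

dflist <- list(
  data.frame(country = "AUSTRIA", year = 2001:2002),
  data.frame(country = "BELGIUM", mac_sector = "ing", year = 2001:2002)
)

for (i in seq_along(files)) {
  df <- dflist[[i]]

  # Assigning into names() changes df directly. A call such as rename(df, ...)
  # only returns a modified copy, which is lost unless it is assigned.
  names(df)[names(df) == "mac_sector"] <- "sector"

  # Add a default "sector" column when the data frame has none.
  if (!"sector" %in% names(df)) df$sector <- "total"

  dflist[[i]] <- df

  # One name per element: index files with [i] and strip the extension.
  names(dflist)[i] <- tools::file_path_sans_ext(files[i])
}

# Build output paths in another directory (hypothetical folder under tempdir()).
out_dir <- file.path(tempdir(), "newlocation")
dir.create(out_dir, showWarnings = FALSE)
out_paths <- file.path(out_dir, paste0(names(dflist), ".dta"))
```

Each entry of out_paths could then be passed to save.dta13() in place of the paste0(...) call. file.path() inserts the path separator for you, so you don't need to write the doubled backslashes by hand.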