Ordering a subset of columns by date in R


I have a data frame in which some of the columns are not in the correct order (their names are dates). See:

data1989 <- data.frame("date_fire" = c("1987-02-01", "1987-07-03", "1988-01-01"), 
                       "Foresttype" = c("oak", "pine", "oak"),
                       "meanSolarRad" = c(500, 550, 450),
                       "meanRainfall" = c(600, 300, 450),
                       "meanTemp" = c(14, 15, 12),
                       "1988.01.01" = c(0.5, 0.589, 0.66), 
                       "1986.06.03" = c(0.56, 0.447, 0.75), 
                       "1986.10.19" = c(0.8, NA, 0.83),
                       "1988.01.19" = c(0.75, 0.65,0.75), 
                       "1986.06.19" = c(0.1, 0.55,0.811),
                       "1987.10.19" = c(0.15, 0.12, 0.780),
                       "1988.01.19" = c(0.2, 0.22,0.32), 
                       "1986.06.19" = c(0.18, 0.21,0.23),
                       "1987.10.19" = c(0.21, 0.24, 0.250),
                       check.names = FALSE,
                       stringsAsFactors = FALSE) 

> data1989
   date_fire Foresttype meanSolarRad meanRainfall meanTemp 1988.01.01 1986.06.03 1986.10.19 1988.01.19 1986.06.19 1987.10.19 1988.01.19 1986.06.19 1987.10.19
1 1987-02-01        oak          500          600       14      0.500      0.560       0.80       0.75      0.100       0.15       0.20       0.18       0.21
2 1987-07-03       pine          550          300       15      0.589      0.447         NA       0.65      0.550       0.12       0.22       0.21       0.24
3 1988-01-01        oak          450          450       12      0.660      0.750       0.83       0.75      0.811       0.78       0.32       0.23       0.25

I would like to order the date columns by increasing date while keeping the first 5 columns where they are. Keep in mind that my original dataset has 30 initial columns that must stay in place.


Solution

  • If you want to use dplyr, here is an alternative. Note that each column name has to be unique. Your data frame had some duplicate names, so those columns are commented out below.

    library(dplyr)
    
    data1989 <- data.frame("date_fire" = c("1987-02-01", "1987-07-03", "1988-01-01"), 
                           "Foresttype" = c("oak", "pine", "oak"),
                           "meanSolarRad" = c(500, 550, 450),
                           "meanRainfall" = c(600, 300, 450),
                           "meanTemp" = c(14, 15, 12),
                           "1988.01.01" = c(0.5, 0.589, 0.66), 
                           "1986.06.03" = c(0.56, 0.447, 0.75), 
                           "1986.10.19" = c(0.8, NA, 0.83),
                           "1988.01.19" = c(0.75, 0.65,0.75), 
                           "1986.06.19" = c(0.1, 0.55,0.811),
                           "1987.10.19" = c(0.15, 0.12, 0.780),
                           # "1988.01.19" = c(0.2, 0.22,0.32),
                           # "1986.06.19" = c(0.18, 0.21,0.23),
                           # "1987.10.19" = c(0.21, 0.24, 0.250),
                           check.names = FALSE,
                           stringsAsFactors = FALSE) 
    
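    # Sort the date column names. Replace 6 with the index of the first date column.
    # Names in YYYY.MM.DD format sort chronologically as plain strings.
    sorted_colnames <- sort(names(data1989)[6:ncol(data1989)])
    
    # Reorder the columns. Replace 5 with the index of the last non-date column.
    # all_of() tells select() that sorted_colnames is an external character vector.
    data1989 %>% 
      select(1:5, all_of(sorted_colnames))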
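
If the duplicate column names have to stay, one option is to reorder by column position in base R instead of by name. The following is only a sketch: it assumes every date column name parses with the format `%Y.%m.%d`, and `n_fixed` is a name made up here for the number of leading columns to leave alone (5 in the example, 30 in the real dataset).

```r
# Example data from the question, with the duplicate date names kept
data1989 <- data.frame("date_fire" = c("1987-02-01", "1987-07-03", "1988-01-01"),
                       "Foresttype" = c("oak", "pine", "oak"),
                       "meanSolarRad" = c(500, 550, 450),
                       "meanRainfall" = c(600, 300, 450),
                       "meanTemp" = c(14, 15, 12),
                       "1988.01.01" = c(0.5, 0.589, 0.66),
                       "1986.06.03" = c(0.56, 0.447, 0.75),
                       "1986.10.19" = c(0.8, NA, 0.83),
                       "1988.01.19" = c(0.75, 0.65, 0.75),
                       "1986.06.19" = c(0.1, 0.55, 0.811),
                       "1987.10.19" = c(0.15, 0.12, 0.780),
                       "1988.01.19" = c(0.2, 0.22, 0.32),
                       "1986.06.19" = c(0.18, 0.21, 0.23),
                       "1987.10.19" = c(0.21, 0.24, 0.250),
                       check.names = FALSE,
                       stringsAsFactors = FALSE)

n_fixed <- 5  # hypothetical name: number of leading columns to keep in place

# Positions of the date columns and their names parsed as dates
date_idx  <- seq(n_fixed + 1, ncol(data1989))
date_vals <- as.Date(names(data1989)[date_idx], format = "%Y.%m.%d")

# Fixed columns first, then date columns in increasing date order;
# order() leaves tied (duplicate) dates in their original relative order
col_order <- c(seq_len(n_fixed), date_idx[order(date_vals)])

data1989_sorted <- data1989[, col_order]

# Restore the original names explicitly, in case subsetting altered duplicates
names(data1989_sorted) <- names(data1989)[col_order]

data1989_sorted
```

Indexing by position avoids name lookups altogether, so duplicate names cannot cause the wrong column to be picked.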