r date xlsx

How to extract the date from a file name and sort by it to find the latest file?


I currently have several files in a folder. They contain daily stock updates, and their names look like this:

Onhand Harian 12 Juli 2019.xlsx
Onhand Harian 13 Juli 2019.xlsx
Onhand Harian 14 Juli 2019.xlsx... and so on.

I would like to read ONLY the latest Excel file, using the date in the file name. How can this be done? Thanks in advance.


Solution

  • If all of your files share the same name pattern (a fixed prefix followed by the date), you can do:

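    # List all the file names in the folder
    file_names <- list.files("/path/to/folder/", full.names = TRUE)
    
    # Remove the fixed prefix and the extension so only the date string remains
    date_strings <- sub("^Onhand Harian ", "", sub("\\.xlsx$", "", basename(file_names)))
    
    # Convert the date strings to Date objects.
    # Note: %B matches month names in the current locale, so "Juli" parses
    # only when the session locale spells July that way (e.g. Indonesian).
    file_dates <- as.Date(date_strings, format = "%d %B %Y")
    
    # Sort by date and take the latest file
    file_to_read <- file_names[order(file_dates, decreasing = TRUE)[1]]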
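
Because `%B` depends on the session locale, `as.Date` returns `NA` for "Juli" in an English locale. The following is a minimal sketch of one locale-independent alternative, not part of the original answer: it looks the month name up in a hand-written vector. The vector `id_months` and the helper `parse_onhand_date` are hypothetical names introduced here, and the Indonesian spellings are an assumption about how the files are named.

```r
# Indonesian month names in calendar order (assumed spelling in the file names)
id_months <- c("Januari", "Februari", "Maret", "April", "Mei", "Juni",
               "Juli", "Agustus", "September", "Oktober", "November", "Desember")

# Hypothetical helper: turn "Onhand Harian 12 Juli 2019.xlsx" into a Date
parse_onhand_date <- function(path) {
  # Keep only the "12 Juli 2019" part of the file name
  date_string <- sub("^Onhand Harian ", "", sub("\\.xlsx$", "", basename(path)))
  parts <- strsplit(date_string, " ", fixed = TRUE)
  day   <- as.integer(vapply(parts, `[`, character(1), 1))
  month <- match(vapply(parts, `[`, character(1), 2), id_months)
  year  <- as.integer(vapply(parts, `[`, character(1), 3))
  # ISOdate builds a date-time from numeric parts, so no locale is involved
  as.Date(ISOdate(year, month, day))
}

# Example input; in practice this would come from list.files()
file_names <- c("Onhand Harian 12 Juli 2019.xlsx",
                "Onhand Harian 14 Juli 2019.xlsx",
                "Onhand Harian 13 Juli 2019.xlsx")

# Sort by the parsed date and take the latest file
file_to_read <- file_names[order(parse_onhand_date(file_names), decreasing = TRUE)[1]]
```

A month name that is not in `id_months` yields `NA`, and `order` places `NA` values last, so such files are never picked ahead of a file with a valid date.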
    

    Alternatively, if your files are generated every day, you can also select the latest one based on its creation or modification time using file.info.
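
The modification-time approach could look roughly like the sketch below. It is an illustration under assumptions: the demo folder and its two empty files are created only so the example is self-contained, and in practice `folder` would point at the real directory.

```r
# Create a throwaway folder with two files so the sketch is self-contained
folder <- file.path(tempdir(), "onhand_demo")
dir.create(folder, showWarnings = FALSE)
old_file <- file.path(folder, "Onhand Harian 12 Juli 2019.xlsx")
new_file <- file.path(folder, "Onhand Harian 13 Juli 2019.xlsx")
file.create(old_file, new_file)

# Backdate the first file by one day so the modification times differ
Sys.setFileTime(old_file, Sys.time() - 24 * 60 * 60)

# List the Excel files in the folder
file_names <- list.files(folder, pattern = "\\.xlsx$", full.names = TRUE)

# file.info() returns one row per file; mtime is the last modification time
details <- file.info(file_names)

# Sort by modification time and take the most recently modified file
file_to_read <- file_names[order(details$mtime, decreasing = TRUE)[1]]
```

Note that the modification time reflects when a file was last written, not the date in its name, so copying or re-saving an older file can change which one is selected.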