r lubridate

Extracting date and time from .wav filenames in R


I have a column of file names in the form site_date_time.wav, and I need to parse the date and time segments out of each one. A typical file name looks like this: AF10_20160602_183000.wav.

I've used gsub(".wav", "", filename) to remove the .wav, and I've used as.Date(sub('[^_]+_(\\d+).*', '\\1', df[,5]), "%Y%m%d") to extract the date, but I can't seem to make this method work for the time segment.

I tried sub('^[^_]+_[^_]+_(\\d{2})(\\d{2})_.*', '\\1:\\2', df[,5]) to turn the time into a separate value that I could then pass to strptime, but it didn't work. I'm not quite sure what I'm missing.

If I could put date and time together into a single column that would be even better. Any suggestions would be helpful.


Solution

  • You can remove everything up to and including the first underscore with sub, and then use as.POSIXct with a matching format string to convert the remainder to a date-time value.

    x <- c('AF10_20160602_183000.wav', 'BF10_20160602_143000.wav')
    as.POSIXct(sub('.*?_', '', x), format = '%Y%m%d_%H%M%S.wav', tz = 'UTC')
    #[1] "2016-06-02 18:30:00 UTC" "2016-06-02 14:30:00 UTC"
    

    You can also use lubridate::ymd_hms:

    lubridate::ymd_hms(sub('.*?_', '', x))
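
To get the combined date-time into a single data frame column, as the question asks, the same idea can be applied with an explicit capture pattern. The following is a minimal base R sketch, not the only way to do it. The data frame `df`, the column name `filename`, and the choice of UTC are assumptions made for illustration; the question only shows that the names live in df[,5]. The last lines also show one likely reason the pattern in the question did not match: it requires an underscore after the first four time digits, whereas the file name continues with "00.wav".

```r
# Hypothetical data frame; the column name `filename` is an assumption.
df <- data.frame(
  filename = c("AF10_20160602_183000.wav", "BF10_20160602_143000.wav"),
  stringsAsFactors = FALSE
)

# Capture the 8-digit date and the 6-digit time explicitly, anchored on
# the site prefix at the start and the .wav extension at the end.
pattern <- "^[^_]+_(\\d{8})_(\\d{6})\\.wav$"
stamp <- sub(pattern, "\\1 \\2", df$filename)

# Single date-time column (UTC is assumed; substitute the recorder's zone).
df$datetime <- as.POSIXct(stamp, format = "%Y%m%d %H%M%S", tz = "UTC")

# Separate date and time-of-day columns, if they are also wanted.
df$date <- as.Date(df$datetime, tz = "UTC")
df$time <- format(df$datetime, "%H:%M:%S")

# Variant of the pattern from the question: match the seconds and the
# extension instead of expecting a trailing underscore.
hhmm <- sub("^[^_]+_[^_]+_(\\d{2})(\\d{2})\\d{2}\\.wav$", "\\1:\\2", df$filename)
```

File names that do not fit the pattern are returned unchanged by sub, so as.POSIXct yields NA for them, which makes malformed names easy to spot with is.na(df$datetime).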