
R parse timestamp of form %j%Y with no leading zeroes


I am working with CSV timestamp data given in the form '%j%Y %H:%M' with no leading zeroes. Here are some example timestamps:

112005 22:00
1292005 6:00

R reads the first line as the 112th day of the year 005, when it should be the 11th day of 2005. How can I make R parse this information correctly?

Code I'm using which doesn't work:

train$TIMESTAMP <- strptime(train$TIMESTAMP, format='%j%Y %H:%M', tz='GMT')
train$hour <- as.numeric(format(train$TIMESTAMP, '%H'))

Solution

  • I don't think there's any simple way for strptime itself to work out where the day stops and the year starts. One option is to split the string just before something that looks like a plausible year (20XX):

    gsub("^(\\d{1,3})(20\\d{2})","\\1 \\2",train$TIMESTAMP)
    #[1] "11 2005 22:00" "129 2005 6:00"
    

    and then parse the result with a format that has a space between %j and %Y:

    strptime(gsub("^(\\d{1,3})(20\\d{2})","\\1 \\2",train$TIMESTAMP), "%j %Y %H:%M")
    #[1] "2005-01-11 22:00:00 EST" "2005-05-09 06:00:00 EST"
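
The two steps above can be combined into one short sketch that also keeps the `tz='GMT'` setting and the hour extraction from the question. It rests on one assumption that is not stated in the original answer: the year is always exactly four digits and sits directly before the space, so the one to three digits in front of it must be the day of year. That relaxes the `20XX` restriction, but it is only a sketch under that assumption. The `stamps` vector is made-up sample input standing in for `train$TIMESTAMP`.

```r
# Sample input taken from the question; stands in for train$TIMESTAMP.
stamps <- c("112005 22:00", "1292005 6:00")

# Assumption: the four digits right before the space are the year,
# so the 1-3 digits in front of them are the day of year.
split_stamps <- sub("^(\\d{1,3})(\\d{4}) ", "\\1 \\2 ", stamps)

# Parse in GMT, as in the question, so the result does not
# depend on the local time zone of the machine.
parsed <- strptime(split_stamps, format = "%j %Y %H:%M", tz = "GMT")

# Hour of day as a number, as in the question's second line.
hours <- as.numeric(format(parsed, "%H"))

# Entries that do not fit the expected layout parse to NA,
# so they can be flagged and inspected separately.
bad <- is.na(parsed)
```

For the two sample timestamps, `hours` is 22 and 6 and `bad` is `FALSE` for both. Checking `bad` on the full column is a cheap way to spot rows where the layout assumption does not hold.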