I have values in the format "221559460", where the first two digits are the hour, the next two are the minute, and the remaining digits are milliseconds (here 59460 ms, i.e. 59.460 s). Is there a shortcut in R to convert these to a date-time format without manually inserting ":" into the value and calculating the seconds from the millisecond digits?
I don't think you can avoid some parsing here, because the seconds are not separated from the milliseconds, but a simple regular expression handles it fairly easily:
> sub(pattern = '([0-9]{2})([0-9]{2})([0-9]{2})([0-9]*)',
+ replacement = '\\1:\\2:\\3.\\4',
+ x = 221559460)
[1] "22:15:59.460"
> strptime(
+ x = sub(pattern = '([0-9]{2})([0-9]{2})([0-9]{2})([0-9]*)',
+ replacement = '\\1:\\2:\\3.\\4',
+ x = 221559460),
+ format = '%H:%M:%OS')
[1] "2015-12-11 22:15:59 PST"
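If only the number of seconds is needed rather than a date-time object, the fields can also be pulled out arithmetically. This is a minimal sketch, assuming the value is numeric and encodes HHMMSSmmm; the variable names are illustrative and not part of the original answer.

```r
# Assumption: x is numeric and encodes HHMMSSmmm (hour, minute, second, millisecond).
x <- 221559460

hours   <- x %/% 1e7            # leading digits: 22
minutes <- (x %/% 1e5) %% 100   # next two digits: 15
seconds <- (x %% 1e5) / 1000    # last five digits are milliseconds: 59.46

# Seconds elapsed since midnight, fractional part included.
secs_since_midnight <- hours * 3600 + minutes * 60 + seconds
```

Unlike the regular expression, this also copes with hours below 10, where a numeric value has lost its leading zero.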
Updated answer based on the comment below, for values that carry a date prefix; setting digits.secs makes the fractional seconds print:
> options(digits.secs = 3)
> strptime(
+ x = sub(pattern = '([0-9]{8})_([0-9]{2})([0-9]{2})([0-9]{2})([0-9]*)',
+ replacement = '\\1 \\2:\\3:\\4.\\5',
+ x = '20150819_221559460'),
+ format = '%Y%m%d %H:%M:%OS')
[1] "2015-08-19 22:15:59.46 PDT"
Update with a vectorized example:
> df <- data.frame(timestamp = paste('20150819', 221559460 + round(runif(10) * 100), sep = '_'))
> strptime(sub('([0-9]{3})$', '.\\1', df$timestamp),'%Y%m%d_%H%M%OS')
[1] "2015-08-19 22:15:59.517 PDT" "2015-08-19 22:15:59.550 PDT"
[3] "2015-08-19 22:15:59.538 PDT" "2015-08-19 22:15:59.493 PDT"
[5] "2015-08-19 22:15:59.484 PDT" "2015-08-19 22:15:59.549 PDT"
[7] "2015-08-19 22:15:59.510 PDT" "2015-08-19 22:15:59.462 PDT"
[9] "2015-08-19 22:15:59.466 PDT" "2015-08-19 22:15:59.474 PDT"
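The vectorized call above could be wrapped in a small helper for reuse. This is only a sketch: parse_stamp is a hypothetical name, and fixing the time zone to "UTC" is an assumption made here so that the result does not depend on the local setting.

```r
# Hypothetical helper: parse "YYYYMMDD_HHMMSSmmm" stamps into POSIXlt.
parse_stamp <- function(x, tz = "UTC") {
  x <- as.character(x)                       # also handles factor columns
  with_dot <- sub("([0-9]{3})$", ".\\1", x)  # put "." before the last three digits
  strptime(with_dot, format = "%Y%m%d_%H%M%OS", tz = tz)
}

stamps <- c("20150819_221559460", "20150819_221600005")
parsed <- parse_stamp(stamps)

# The fields are available as list components of the POSIXlt result.
parsed$hour
parsed$min
parsed$sec
```

The fractional part is kept in parsed$sec whether or not options(digits.secs = 3) is set; that option only affects how the values are printed.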