
differences between subsetting POSIXlt and POSIXct in R


DATA

v1 <- c("2015-01-05 15:00:00", "2015-01-05 15:45:00", "2015-01-05 15:00:30")

OPERATIONS

v2 <- strptime(v1, '%Y-%m-%d %H:%M:%S')
str(v2)
POSIXlt[1:3], format: "2015-01-05 15:00:00" "2015-01-05 15:45:00" "2015-01-05 15:00:30"

v3 <- v2[!v2$min]  # create v3 from v2, dropping elements whose minute is not 00

RESULT (successful subsetting)

str(v3)
POSIXlt[1:2], format: "2015-01-05 15:00:00" "2015-01-05 15:00:30"

Now creating v4 by coercing v2 to POSIXct (successful)

v4 <- as.POSIXct(v2)  # no format needed: v2 is already a parsed POSIXlt

str(v4)
POSIXct[1:3], format: "2015-01-05 15:00:00" "2015-01-05 15:45:00" "2015-01-05 15:00:30"

OPERATION IN QUESTION - Applying the same subsetting operation to POSIXct as to POSIXlt causes the error below

v5 <- v4[!v4$min]  # attempt to create v5 from v4, dropping elements whose minute is not 00

RESULT (error)

  Error in v4$min : $ operator is invalid for atomic vectors

QUESTIONS:
a) Why this difference in behavior?
b) What would be an equivalent operation to use with POSIXct?
Many thanks


Solution

  • You misunderstand a critical difference between POSIXlt and POSIXct:

    • POSIXlt is a 'list type' with components you can access with $, as you do with v2$min
    • POSIXct is a 'compact type' that is essentially just a number

    You almost always want POSIXct for comparison and efficient storage (e.g. in a data.frame, or to index a zoo or xts object), and can convert to POSIXlt with as.POSIXlt() when you need to access components. Be warned, though, that the components follow C library conventions: the year is stored as years since 1900 (so 2015 is 115), and months and weekdays start at zero (January and Sunday are both 0).
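As a minimal sketch of question (b), one way to get the same subset from a POSIXct vector is to convert to POSIXlt only to read the minute component, and then index the original POSIXct. The tz = "UTC" argument and the names lt, v5 and v5_alt are assumptions added here so the example is reproducible; they are not part of the original question.

```r
v1 <- c("2015-01-05 15:00:00", "2015-01-05 15:45:00", "2015-01-05 15:00:30")

# tz = "UTC" is an assumption so the result does not depend on the local time zone
v4 <- as.POSIXct(v1, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Convert to POSIXlt just to read the minutes, then subset the POSIXct vector
lt <- as.POSIXlt(v4)
v5 <- v4[lt$min == 0]

# Alternative that never leaves POSIXct: compare the formatted minute field
v5_alt <- v4[format(v4, "%M") == "00"]

# The components follow C library conventions
lt$year[1] + 1900  # calendar year: the stored value is years since 1900
lt$mon[1] + 1      # calendar month: the stored value runs from 0 to 11
lt$wday[1]         # day of week, where 0 is Sunday
```

Both v5 and v5_alt keep the first and third elements and remain POSIXct.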

    Calling str() or unclass() on an object of each class is illuminating. For historical reasons, strptime() returns a POSIXlt. I wish it would return a POSIXct.
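To see the difference in representation directly, the following sketch strips the class from one value of each type. The names x_lt and x_ct and the tz = "UTC" argument are illustrative assumptions, not from the original post.

```r
# strptime() returns a POSIXlt; tz = "UTC" is assumed here for reproducibility
x_lt <- strptime("2015-01-05 15:45:00", "%Y-%m-%d %H:%M:%S", tz = "UTC")
x_ct <- as.POSIXct(x_lt)

class(x_lt)                # "POSIXlt" "POSIXt"
is.list(unclass(x_lt))     # TRUE: a list of components
names(unclass(x_lt))       # includes "sec", "min", "hour", "mday", "mon", "year", ...
unclass(x_lt)$min          # 45

class(x_ct)                # "POSIXct" "POSIXt"
is.numeric(unclass(x_ct))  # TRUE: a single number, so $ has nothing to extract
as.numeric(x_ct)           # seconds since 1970-01-01 00:00:00 UTC
```

This is why v4$min fails with "$ operator is invalid for atomic vectors": underneath, a POSIXct is a plain numeric vector.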