Tags: r, plot, time-series, kernel-density

Temporal density plot in R


I have irregularly measured observations of a phenomenon, each with a timestamp:

2013-01-03 00:04:23
2013-01-03 00:02:04
2013-01-02 23:45:16
2013-01-02 23:35:16
2013-01-02 23:31:56
2013-01-02 23:31:30
2013-01-02 23:29:18
2013-01-02 23:28:43
...

Now I would like to plot these points on the x-axis and apply a kernel density function to them, so I can visually explore temporal density using various bandwidths. The result should look something like the plot below, although that example has no x-axis labels; I would like labels for particular days (January 1st, January 5th, etc.):

[Image: kernel density plot]

It is important, however, that the measurement points themselves are visible in the plot, like above.


Solution

  • #dput
    df <- structure(list(V1 = structure(c(2L, 2L, 1L, 3L, 1L, 4L, 5L, 4L), .Label = c("2013-01-02", "2013-01-03", "2013-01-04", "2013-01-05", "2013-01-11"), class = "factor"), V2 = structure(c(1L, 3L, 8L,  4L, 7L, 6L, 5L, 2L), .Label = c(" 04:04:23", " 06:28:43", " 10:02:04", " 11:35:16", " 14:29:18", " 17:31:30", " 23:31:56", " 23:45:16"), class = "factor")), .Names = c("V1", "V2"), class = "data.frame", row.names = c(NA, -8L))
    

    I'm using ggplot2 since it gives fine-grained control over the plot. Use separate layers for the measurements and for the density itself.

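    library(ggplot2)
    # Combine the date (V1) and time (V2) columns into a single POSIXct column
    df$tcol <- as.POSIXct(paste(df$V1, trimws(as.character(df$V2))), format = "%Y-%m-%d %H:%M:%S")
    # Layer 1: the measurements, drawn as blue squares on the baseline
    measurements <- geom_point(aes(x = tcol, y = 0), shape = 15, color = 'blue', size = 5)
    # Layer 2: the kernel density estimate; bw sets the bandwidth
    kde <- geom_density(aes(x = tcol), bw = "nrd0")
    ggplot(df) + measurements + kde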
    

    This leads to: [image]
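Since the question asks about trying several bandwidths, here is a minimal sketch of one way to compare them in a single plot. It assumes the timestamps are stored as POSIXct, in which case a numeric bw is interpreted in seconds. The data frame obs, the column when, the UTC time zone, and the three bandwidth values are illustrative choices and not part of the original answer; geom_rug is swapped in for the squares to keep the measurement points visible.

```r
library(ggplot2)

# The eight timestamps encoded in the dput() output above, written as strings.
stamps <- c("2013-01-03 04:04:23", "2013-01-03 10:02:04",
            "2013-01-02 23:45:16", "2013-01-04 11:35:16",
            "2013-01-02 23:31:56", "2013-01-05 17:31:30",
            "2013-01-11 14:29:18", "2013-01-05 06:28:43")

# Assumption: UTC is used only to make the example reproducible.
obs <- data.frame(
  when = as.POSIXct(stamps, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
)

# One density layer per bandwidth. On a POSIXct axis the data unit is
# seconds, so hours are converted by multiplying with 3600.
p <- ggplot(obs, aes(x = when)) +
  geom_density(aes(colour = "3 h"),  bw = 3 * 3600) +
  geom_density(aes(colour = "12 h"), bw = 12 * 3600) +
  geom_density(aes(colour = "48 h"), bw = 48 * 3600) +
  geom_rug(sides = "b") +  # tick marks at the actual measurement times
  labs(x = NULL, y = "density (per second)", colour = "bandwidth")
```

Printing p draws the three curves over the same rug of observations, which makes it easier to judge how much smoothing each bandwidth applies.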

    Now, if you want to further adjust the x-axis labels (since you want each separate day marked), you can use the scales package. We are going to use scale_x_date, which only accepts values of class Date, so the timestamps have to be converted first.

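    library(scales)
    # Convert to Date (this drops the time of day); going through format()
    # keeps the calendar day of the timestamp's own time zone
    df$tcol <- as.Date(format(df$tcol, "%Y-%m-%d"))
    xlabel <- scale_x_date(labels = date_format("%m-%d"), date_breaks = "1 day")
    ggplot(df) + xlabel + measurements + kde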
    

    This gives: [image]

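    Please note that the time of day has been dropped: a Date stores only the calendar day, so measurements from the same day now fall on the same x position. To keep the times, leave the column as POSIXct and use scale_x_datetime instead of scale_x_date.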
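As a hedged sketch of that alternative, the following keeps the full timestamps and still labels every day by using scale_x_datetime with its date_breaks and date_labels arguments. The names obs and when, the UTC time zone, and the "%b %d" label format are assumptions made for illustration, not part of the original answer.

```r
library(ggplot2)

# The eight timestamps encoded in the dput() output above, written as strings.
stamps <- c("2013-01-03 04:04:23", "2013-01-03 10:02:04",
            "2013-01-02 23:45:16", "2013-01-04 11:35:16",
            "2013-01-02 23:31:56", "2013-01-05 17:31:30",
            "2013-01-11 14:29:18", "2013-01-05 06:28:43")

# Assumption: UTC is used only to make the example reproducible.
obs <- data.frame(
  when = as.POSIXct(stamps, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
)

p <- ggplot(obs, aes(x = when)) +
  geom_density(bw = "nrd0") +
  # Measurements stay at their exact times because the column is still POSIXct.
  geom_point(aes(y = 0), shape = 15, colour = "blue", size = 3) +
  # One labelled tick per day, e.g. "Jan 03".
  scale_x_datetime(date_breaks = "1 day", date_labels = "%b %d")
```

With this scale the eight points remain distinct on the x-axis, whereas the Date conversion collapses them onto five days.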

    Hopefully this helps you move forward.