Tags: r, date, kernel-density

Kernel Density Estimate in R for Vector of Dates


I have a vector of 195 dates, which I assign to the variable xvar as follows:

xvar <- as.Date(getvars[, 3], "%m/%d/%Y")

I want to fit a distribution over the resulting histogram and then sample from that probability distribution. I can plot the histogram and density of this date vector using ggplot2, but I'm not aware of a way to sample from the resulting density.

I downloaded the R package ks. It works well for a vector of real numbers, but when I pass it the date vector by running these lines:

xvar <- as.Date(getvars[, 3], "%m/%d/%Y")  # Vector of dates
xvnonull <- na.omit(xvar)                  # Remove any NAs
fhat <- kde(xvnonull)                      # Try to fit the KDE (this call fails)

I get an error stating:

"Error in rep(1, n) : invalid 'times' argument".

I have removed all NA values.

Do you have any suggestions on how to go about this problem? Are there alternate libraries/packages out there that will work with dates?


Solution

  • Here's how I would plot this (no bells & whistles):

    Data

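    set.seed(1234)
    # Shuffle 100 consecutive days, then pick 100 of them (with repeats) using
    # indices built as sums of three draws from 1:34, shifted into 1..100.
    days <- sample(seq(from = as.Date("2015-01-01"), length.out = 100L, by = "day"))
    xvar <- days[colSums(matrix(sample(34L, 300L, TRUE), nrow = 3)) - 2L]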
    

    Solution

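    library(ks)
    # Map each distinct date to an integer code (its rank among the unique dates).
    xvar_f <- as.factor(xvar)
    xvar_i <- as.integer(xvar_f)
    # Widen the bottom margin so the rotated date labels fit.
    par(mar = c(5.6, 4.1, 4.1, 2.1))
    # Plot the KDE of the integer codes, suppressing the default x-axis.
    plot(kde(xvar_i), xaxt = "n", xlab = "", ylab = "",
         main = "Density of Dates", las = 1)
    # Label every fifth code with its corresponding date.
    tx <- seq(min(xvar_i), max(xvar_i), by = 5)
    lb <- levels(xvar_f)[tx]
    axis(side = 1, at = tx, labels = lb, las = 2)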
    

    Output

    [Figure: kernel density plot titled "Density of Dates", with date labels on the x-axis]
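
The plot does not cover the sampling part of the question. The following is a minimal sketch of one way to draw new dates from a kernel density estimate, not a definitive implementation. It swaps in base R's density() with its default Gaussian kernel instead of ks, and uses a smoothed bootstrap: resample the observed values and add Gaussian noise whose standard deviation equals the bandwidth, which is equivalent to sampling from a Gaussian KDE. The example data and the names x_num, n_draws, draws_num and draws are made up for illustration. Unlike the factor-based integer codes, as.numeric() on a Date returns days since 1970-01-01, so gaps between dates are preserved.

```r
# Sketch: sample new dates from a Gaussian kernel density estimate (base R only).
set.seed(1234)

# Hypothetical example data: 100 dates drawn from the first 100 days of 2015.
xvar <- sample(seq(as.Date("2015-01-01"), by = "day", length.out = 100L),
               size = 100L, replace = TRUE)

# Dates are stored as days since 1970-01-01, so as.numeric() keeps real spacing.
x_num <- as.numeric(na.omit(xvar))

# Fit a Gaussian KDE; fhat$bw is the standard deviation of the kernel.
fhat <- density(x_num)

# Smoothed bootstrap: resample observed points, then add kernel-sized noise.
n_draws <- 1000L
draws_num <- sample(x_num, n_draws, replace = TRUE) +
  rnorm(n_draws, mean = 0, sd = fhat$bw)

# Round to whole days and convert back to Date.
draws <- as.Date(round(draws_num), origin = "1970-01-01")

head(draws)
```

Because the noise is unbounded, a few sampled dates can fall slightly outside the observed range; clip or redraw them if that matters. If you prefer to stay within ks, the package also provides an rkde() function for sampling from a kde fit on the numeric values; check its documentation for the exact arguments.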