I am using the following function to estimate the Kernel density of some data
KDens = function(x,h,N) {
fx = matrix(0,N,1)
Kx = AK(x,h,N)
for (i in 1:N) {
fx[i] = sum(Kx[i,], na.rm=T)/N
}
return(fx) }
I know this is not the first question about speeding up a loop. I checked around in the site, I found that sometimes using some apply
function is faster, but this is not always the case if you manage to correctly set the loop.
In the above code, every "not needed thing" is left out of the loop, as - if I understood correctly - suggested to speed up computation. However, I made a comparison between the above KDens
function and the density
function implemented in R by default. Well, density
needs 1 or 2 seconds, while KDens
needs ~30 on my machine.
trywiththis <- rnorm(4800)
x = trywiththis
N = length(trywiththis)
h = 1.059*sd(trywiththis , na.rm=T)*(N^(-0.2))
EDIT: the information I provided was not complete
kerf = function(x){ return(dnorm(x)) }
ker = function(x,x0,h){
temp = kerf((x-x0)/h)/h
return(temp)
}
AK = function(x,h,N) {
K = array(0,c(N,N))
for (i in 1:N) {
for (j in 1:N) {
K[i,j] = ker(x[i],x[j],h)
}}
return(K) }
Suppose I want to speed up the KDens function, how could I do that ?
Try this... For your original 4800 length dataset it takes 2.5 seconds.
KDens2 = function(x,h,N) {
Kx <- outer( x , x , FUN = function(x,y) dnorm( ( x-y ) / h ) / h )
fx <- as.matrix( rowSums( Kx ) / N , ncol = 1 )
return( fx )
}
set.seed(1)
trywiththis <- rnorm(480)
x = trywiththis
N = length(trywiththis)
h = 1.059*sd(trywiththis , na.rm=T)*(N^(-0.2))
#Produce same result? (posibly not identical because of 'dnorm' function)
all.equal( KDens(x,h,N) , KDens2(x,h,N) )
[1] TRUE
#Rough timing on N=480 length data...
system.time( KDens2(x,h,N) )
# user system elapsed
# 0.01 0.00 0.02
system.time( KDens(x,h,N) )
# user system elapsed
# 2.7 0.0 2.7
And when N=4800
...
system.time( KDens2(x,h,N) )
user system elapsed
2.33 0.19 2.51