I am trying to write a function to calculate the h-point. The function is defined over a rank-frequency data frame. Consider the following data.frame:
DATA <-data.frame(frequency=c(64,58,54,32,29,29,25,17,17,15,12,12,10), rank=c(seq(1, 13)))
The formula for the h-point is:
if there is a rank r with r = f(r), then h-point = r; otherwise h-point = (f(i)*j - f(j)*i) / (j - i + f(i) - f(j)), where f(i) and f(j) are the frequencies at the ith and jth ranks, and i and j are adjacent ranks such that i < f(i) and j > f(j).
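To make the second case concrete, here is a small worked example. It assumes the numerator and the denominator are each grouped in parentheses, and the numbers are made up for illustration (they are not taken from DATA):

```r
# Hypothetical adjacent ranks and their frequencies (illustration only)
i <- 3; f_i <- 5   # i < f(i)
j <- 4; f_j <- 2   # j > f(j)

# Interpolated h-point: where the line through (i, f_i) and (j, f_j)
# crosses the diagonal f(r) = r
h <- (f_i * j - f_j * i) / (j - i + f_i - f_j)
h
# [1] 3.5
```

The line through (3, 5) and (4, 2) is f(r) = 14 - 3r, which equals r at r = 3.5, so the formula is just a linear interpolation between the two ranks.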
This is what I've done so far:
h_point <- function(data){
  x <- seq(nrow(data))
  f_x <- data[["frequency"]][x]
  h <- which(x == f_x)
  if(length(h)>1) h
  else{
    i <- which(x < f_x)
    j <- which(x > f_x)
    s <- which(outer(i,j,"-") == -1, TRUE)
    i <- i[s[,1]]
    j <- j[s[,2]]
    cat("i: ",i, "j: ", j,"\n")
    f_x[i]*j - f_x[j]*i / (i-j + f_x[i]-f_x[j])
  }
}
In DATA, the h-point is 12, because x = f_x at rank 12. However,
h_point(DATA)
i: j:
numeric(0)
What am I doing wrong here?
I had a look at your previous post, "how to calculate h-point", but I must say that I don't quite follow your method for calculating the h-point.
Based on the definition of the h-point I found, I think a simpler approach would be to use approxfun to create a function frequency(rank), and then use uniroot to find the h-point:
get_h_point <- function(DATA) {
  fn_interp <- approxfun(DATA$rank, DATA$frequency)
  fn_root <- function(x) fn_interp(x) - x
  uniroot(fn_root, range(DATA$rank))$root
}
get_h_point(DATA)
#[1] 12
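As for why your own function returns numeric(0), as far as I can tell there are three issues. First, `length(h) > 1` is FALSE when exactly one rank matches its frequency, so the function falls through to the else branch; there, rank 12 belongs to neither `i` nor `j`, no pair of adjacent ranks is found, and the result is empty. Second, division binds tighter than subtraction in R, so the numerator needs its own parentheses. Third, the denominator uses `i - j` where the formula has `j - i`. Below is a minimal sketch of a corrected version; the name `h_point_fixed` and the toy data are mine, and it assumes the rows are sorted by rank with non-increasing frequencies.

```r
DATA <- data.frame(frequency = c(64,58,54,32,29,29,25,17,17,15,12,12,10),
                   rank = seq(1, 13))

# Hypothetical corrected version of the question's h_point().
# Assumes rows are sorted by rank and frequencies do not increase.
h_point_fixed <- function(data) {
  r   <- data[["rank"]]
  f_r <- data[["frequency"]]

  # Case 1: some rank equals its own frequency
  h <- which(r == f_r)
  if (length(h) > 0) return(r[h[1]])

  # Case 2: interpolate between adjacent ranks i and j,
  # where r[i] < f(r[i]) and r[j] > f(r[j])
  below <- which(r < f_r)
  if (length(below) == 0 || max(below) == length(r)) {
    stop("no crossing of f(r) = r within the data")
  }
  i <- max(below)
  j <- i + 1
  (f_r[i] * r[j] - f_r[j] * r[i]) / (r[j] - r[i] + f_r[i] - f_r[j])
}

h_point_fixed(DATA)
# [1] 12

# Made-up data with no exact match, to exercise the interpolation branch
toy <- data.frame(frequency = c(10, 8, 5, 2, 1), rank = 1:5)
h_point_fixed(toy)
# [1] 3.5
```

Under the same assumptions, this should agree with the approxfun/uniroot approach up to uniroot's tolerance, since both interpolate linearly between the two ranks that bracket the crossing.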