I'm trying to calculate y values based on x values and a series of curves defined by a set of vertices. I'm pretty new to R, and couldn't find a direct way to do this. I've found a couple of methods, but I am looking for a better method to generate a formula I can then apply to a series of data frames, to use in further transformations. The curves and vertices are literature values, and I need to use them as they are (linear line segments going through each vertex), rather than interpreting a model based on the points and then adding predictions.
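# Sample curve vertices
Depth <- c(0, 0.45, 1.05, 1.65, 2.25, 2.35, 2.65, 10, 30)
Preference_depth_01 <- c(0, 0, 0.3, 0.8, 1, 1, 0.7, 0.7, 0)
# Keep the vertices in a data frame; the spline fit further down refers to it as `curve`
curve <- data.frame(Depth, Preference_depth_01)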
# Example data
Depth <- c(0.00, 0.42, 0.6328287, 2.7463492, 3.6011860, 3.5307984, 2.8850018, 2.0481874, 0.9274444,0)
example <- data.frame(Depth,"Pref"=0)
I've tried two methods that work, but they are less than ideal, and I'm wondering if there is an accepted solution.
I can define a function by hand and then apply it to my data frames, but defining the function is clunky and error-prone, and I need to define a dozen different curves and apply each in turn to a series of data tables.
# Defining recoding function
pref_01 <- function(x){
val <- if (x <= .45){0
}else if (x <= 1.05){(.5 * x - .225)
}else if (x <= 1.65){(5 / 6 * x - .575)
}else if (x <= 2.25){(x / 3 + .25)
}else if (x <= 2.35){1
}else if (x <= 2.65){(-x + 3.35)
}else if (x <= 10){.7
}else if (x <= 30){-.035 * x + 1.05}
return(val)
}
# Applying function to example data
for(i in 1:length(example[, 1])){
example[[i, 2]] <- pref_01(example[[i, 1]])
}
Since the curves are linear splines, I tried lspline() from the lspline package. This method is better, but when I add the predictions I get some negative values where I expect zeros:
# Using linear spline method
library(lspline)
library(modelr)
spline_curve <- lm(Preference_depth_01 ~ lspline(Depth, knots = Depth[2:8]),
                   data = curve)
example <- add_predictions(data = example, model = spline_curve, var = "Spline")
example
# Depth Pref Spline
# 1 0.0000000 0.00000000 3.816839e-16
# 2 0.4200000 0.00000000 -5.794200e-17
# 3 0.6328287 0.09141435 9.141435e-02
# 4 2.7463492 0.70000000 7.000000e-01
# 5 3.6011860 0.70000000 7.000000e-01
# 6 3.5307984 0.70000000 7.000000e-01
# 7 2.8850018 0.70000000 7.000000e-01
# 8 2.0481874 0.93272913 9.327291e-01
# 9 0.9274444 0.23872220 2.387222e-01
# 10 0.0000000 0.00000000 3.816839e-16
My next step involves using the predictions in a geometric mean, and the negative values cause problems. I can get around this by rounding, but am wondering if there is a more elegant or accepted solution.
You can use the approx function to interpolate linearly between your vertices. The example below generates 1000 interpolated points:
Depth <- c(0, 0.45, 1.05, 1.65, 2.25, 2.35, 2.65, 10, 30)
Preference_depth_01 <- c(0, 0, 0.3, 0.8, 1, 1, 0.7, 0.7, 0)
plot(Depth, Preference_depth_01, pch = 19, cex = 1.5)
app <- approx(Depth, Preference_depth_01, n = 1000)
points(app, col = 2, pch = "o", cex = .5)
For further analyses you can use app$x and app$y, which are the x and y components of the list app, respectively.
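If you would rather evaluate the curve at your own Depth values than on an evenly spaced grid, approxfun (documented alongside approx in the stats package) returns a reusable interpolating function. The following is only a sketch under some assumptions: the names curve_depth, curve_pref and pref_fun are made up for illustration, and the example data frame is rebuilt from the values in the question.

```r
# Curve vertices from the question (variable names are illustrative)
curve_depth <- c(0, 0.45, 1.05, 1.65, 2.25, 2.35, 2.65, 10, 30)
curve_pref  <- c(0, 0, 0.3, 0.8, 1, 1, 0.7, 0.7, 0)

# approxfun() returns a function that interpolates linearly between the vertices
pref_fun <- approxfun(curve_depth, curve_pref, method = "linear")

# Example data from the question
example <- data.frame(
  Depth = c(0.00, 0.42, 0.6328287, 2.7463492, 3.6011860,
            3.5307984, 2.8850018, 2.0481874, 0.9274444, 0)
)

# The returned function is vectorised, so no loop is needed
example$Pref <- pref_fun(example$Depth)
example
```

With the default rule = 1, depths outside the range of the vertices (here below 0 or above 30) come back as NA; passing rule = 2 to approxfun would instead hold the nearest end value. Because no model is fitted, the flat segments are reproduced without the tiny negative values seen in the lm-based predictions. For a dozen curves, one option is to build one such function per curve and apply each to your data frames with lapply.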