Tags: r, regression, prediction, predict

Predict X value from Y value with a fitted 2-degree polynomial model


I have a dataset with the following format:

dataset1 = data.frame(
caliber = c("5000", "2500", "1250", "625", "312.5", "156", "80", "40", "20", "0"),
var1 = c(NA, NA, NA, 30458, 13740,11261, 9729, 5039, 3343, 367),
var2 = c(463000, 271903, 154611,87204, 47228, 28082, 14842, 8474, 5121, 1308),
var3 = c(308385, 184863, 89719, 48986, 27968, 18557, 9191, 5248, 3210, 703), 
var4 = c(290159, 149061, 64045, 36864, 19092, 12515, 6805, 3933, 2339, 574), 
var5 = c(270801, 163657, 51642, 48197, 23582, 14544, 7877, 4389, 2663, 482), 
var6 = c(NA, NA, NA, 37316, 21305, 11823, 5692, 3070, 1781, 363))

The relationship between caliber and each of the other variables is best described by a second-degree polynomial: var = poly(caliber, 2, raw=T)

My question is how I can use a new set of measurements to estimate the corresponding caliber values. As shown below, I already have the results for each variable, but the caliber itself is unknown.

dataset2 = data.frame(
caliber = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA),
var1 = c(1120, 1296, 1132, 1280, 1096, 1124, 1004, 8384, 1072, 1104, 1568, 1044, 1108, 1012),
var2 = c(5044, 4924, 5088, 4804, 4824, 4844, 4964, 4788, 4804, 4964, 4824, 4788, 4844, 4944),
var3 = c(2836, 2744, 2744, 2668, 2688, 2940, 2756, 2720, 2668, 2892, 2636, 2700, 2836, 2668),
var4 = c(8872, 61580, 3036, 4468, 12132, 3000, 7920, 6868, 6896, 9392, 4728, 6896, 21076, 3228),
var5 = c(2312, 4236, 1928, 4448, 2388, 2108, 3644, 3060, 2168, 1912, 1812, 3528, 4100, 2176),
var6 = c(1156, 1228, 1224, 1364, 1128, 1176, 1184, 1640, 1188, 1300, 1332, 1176, 1176, 1152))

I am aware of a few previous threads on this topic, but none of them helped. The major issues were:

formula <- lm(var2~poly(caliber,2,raw=T), dataset1)
approx(x = formula$fitted, y = formula$caliber, xout = 0)$y

NA value for formula$caliber

mod<-lm(var2~poly(caliber, 2, raw=T), data=dataset1); summary(mod)
newdata=data.frame("var2"=dataset2[1:24,c("var2")])
pred<-predict(mod,newdata, type = 'response')

Error in poly(caliber, 2, coefs = list(alpha = c(998.35, 3691.21383929929 :object 'caliber' not found

unable to pass predict to another dataset

datasets with different rows

interpolation between X and Y gave wrong values


Solution

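  • Based on my understanding of the discussion, I suggest the following solution: fit the model the other way round, with caliber as the response and var2 as the predictor, and then predict caliber for dataset2.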

    dataset1 = data.frame(
      caliber = c(5000, 2500, 1250, 625, 312.5, 156, 80, 40, 20, 0),
      var1 = c(NA, NA, NA, 30458, 13740,11261, 9729, 5039, 3343, 367),
      var2 = c(463000, 271903, 154611,87204, 47228, 28082, 14842, 8474, 5121, 1308),
      var3 = c(308385, 184863, 89719, 48986, 27968, 18557, 9191, 5248, 3210, 703), 
      var4 = c(290159, 149061, 64045, 36864, 19092, 12515, 6805, 3933, 2339, 574), 
      var5 = c(270801, 163657, 51642, 48197, 23582, 14544, 7877, 4389, 2663, 482), 
      var6 = c(NA, NA, NA, 37316, 21305, 11823, 5692, 3070, 1781, 363))
    
    # Inverse model: caliber is the response, so it can be predicted from var2
    formula <- lm(caliber ~ poly(var2, degree = 2, raw = TRUE), data = dataset1)
    
    dataset2 = data.frame(
      caliber = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA),
      var1 = c(1120, 1296, 1132, 1280, 1096, 1124, 1004, 8384, 1072, 1104, 1568, 1044, 1108, 1012),
      var2 = c(5044, 4924, 5088, 4804, 4824, 4844, 4964, 4788, 4804, 4964, 4824, 4788, 4844, 4944),
      var3 = c(2836, 2744, 2744, 2668, 2688, 2940, 2756, 2720, 2668, 2892, 2636, 2700, 2836, 2668),
      var4 = c(8872, 61580, 3036, 4468, 12132, 3000, 7920, 6868, 6896, 9392, 4728, 6896, 21076, 3228),
      var5 = c(2312, 4236, 1928, 4448, 2388, 2108, 3644, 3060, 2168, 1912, 1812, 3528, 4100, 2176),
      var6 = c(1156, 1228, 1224, 1364, 1128, 1176, 1184, 1640, 1188, 1300, 1332, 1176, 1176, 1152))
    
    # One predicted caliber per row of dataset2 (only the var2 column is used)
    predict(formula, newdata = dataset2, type = 'response')
    

    The output of predict() gives the estimated caliber values for the rows of dataset2.

    I have also corrected your dataset1. Values written in double quotes are stored as character, so I removed the quotes from the caliber variable to make it numeric.
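
If the caliber column already arrives as character, as in the question, it can also be converted instead of retyped. The snippet below is only a minimal sketch of that idea, not part of the code above: it converts the column with as.numeric(), fits the same inverse model on var2, and stores the predictions next to the new measurements. The names `cal`, `new` and `fit` are illustrative assumptions, and only var2 is used.

```r
# Calibration data, with caliber stored as character as in the question
cal <- data.frame(
  caliber = c("5000", "2500", "1250", "625", "312.5", "156", "80", "40", "20", "0"),
  var2 = c(463000, 271903, 154611, 87204, 47228, 28082, 14842, 8474, 5121, 1308)
)

# Convert the character column to numeric before modelling
cal$caliber <- as.numeric(cal$caliber)

# Inverse model: caliber as a quadratic function of var2
fit <- lm(caliber ~ poly(var2, degree = 2, raw = TRUE), data = cal)

# New measurements with unknown caliber (the var2 column of dataset2)
new <- data.frame(
  var2 = c(5044, 4924, 5088, 4804, 4824, 4844, 4964,
           4788, 4804, 4964, 4824, 4788, 4844, 4944)
)

# Store the predicted caliber next to the measurements
new$caliber <- predict(fit, newdata = new)
print(new)
```

Before relying on these predictions, it may be worth comparing fitted(fit) with cal$caliber to see how well the quadratic reproduces the known calibration points.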