Search code examples
rdataframeif-statementinterpolationlinear-interpolation

Computing values of each row of a data.frame using linear interpolation in R?


I would like to compute capacity corresponding to its Level of the ElevationData data.frame. I do have ElevationStorage data.frame where i have standard Elevation and Storage capacity curve data. In case any Level value from ElevationData equals to any Elevation value of ElevationStorage, i would use its corresponding Storage value otherwise i need to interpolate between Level and Elevation data and use an interpolated Storage value. Here is my incomplete code as i ran out of ideas to provide a complete one

library(tidyverse)
library(lubridate)

ElevationData <- data.frame(Date = seq(as.Date("2010-01-01"), to = as.Date("2010-01-31"), by = "days"),
                         Level = runif(31,364.1,364.7),
                         Outflow = runif(31,1,25))
ElvationStorage <- data.frame(Elevation = c(362.5,363.0,363.5,364.0,364.5,365.0,365.5,366.0),
                              Stroage = c(1000,2500,3500,4500,6000,7500,9000,10000))
ElevationData$Capacity <- if (ElevationData$Level == ElvationStorage$Elevation){
  Capacity = ElvationStorage$Stroage
} else
  # linear interpolation
  Capacity = 
  }

Output:

Here is a snapshot of the linear interpolation equation- i guess r has a built in function for this (approx()). enter image description here


Solution

  • You can use approxfun. It takes your two variables, which you can think of as "x" and "y" co-ordinates on a plot, and it generates a function for you. With this new function, you can input any "x" value to get the interpolated "y" value.

    In your case, that would look like this:

    f <- approxfun(ElvationStorage$Elevation, ElvationStorage$Stroage)
    
    f(ElevationData$Level)
    #>  [1] 5827.377 5370.168 6041.667 6488.930 6202.926 6227.891 6320.087 5044.925
    #>  [9] 5529.675 4840.971 5416.124 6416.492 5165.368 5044.048 5617.599 5963.217
    #> [17] 5223.595 6565.421 6121.812 5441.037 5426.820 5107.055 5481.322 5045.519
    #> [25] 5371.447 5122.873 6545.012 6377.317 4926.268 4958.893 6304.905
    

    So your data frame looks like this:

    ElevationData$Capacity <- f(ElevationData$Level)
    ElevationData
    #>          Date    Level   Outflow Capacity
    #> 1  2010-01-01 364.4425 23.387535 5827.377
    #> 2  2010-01-02 364.2901  9.249423 5370.168
    #> 3  2010-01-03 364.5139  8.596993 6041.667
    #> 4  2010-01-04 364.6630 16.188796 6488.930
    #> 5  2010-01-05 364.5676 11.367437 6202.926
    #> 6  2010-01-06 364.5760 21.487579 6227.891
    #> 7  2010-01-07 364.6067  9.117475 6320.087
    #> 8  2010-01-08 364.1816  1.534004 5044.925
    #> 9  2010-01-09 364.3432 13.664697 5529.675
    #> 10 2010-01-10 364.1137 10.116649 4840.971
    #> 11 2010-01-11 364.3054  3.352967 5416.124
    #> 12 2010-01-12 364.6388 18.502859 6416.492
    #> 13 2010-01-13 364.2218 21.086555 5165.368
    #> 14 2010-01-14 364.1813 11.706143 5044.048
    #> 15 2010-01-15 364.3725 12.581834 5617.599
    #> 16 2010-01-16 364.4877  9.467945 5963.217
    #> 17 2010-01-17 364.2412 22.229399 5223.595
    #> 18 2010-01-18 364.6885  5.427644 6565.421
    #> 19 2010-01-19 364.5406  3.447776 6121.812
    #> 20 2010-01-20 364.3137 19.334715 5441.037
    #> 21 2010-01-21 364.3089  3.248852 5426.820
    #> 22 2010-01-22 364.2024 11.173089 5107.055
    #> 23 2010-01-23 364.3271 12.973787 5481.322
    #> 24 2010-01-24 364.1818  5.006526 5045.519
    #> 25 2010-01-25 364.2905  9.306031 5371.447
    #> 26 2010-01-26 364.2076 14.297282 5122.873
    #> 27 2010-01-27 364.6817 20.449419 6545.012
    #> 28 2010-01-28 364.6258  7.465519 6377.317
    #> 29 2010-01-29 364.1421  2.480287 4926.268
    #> 30 2010-01-30 364.1530 11.092029 4958.893
    #> 31 2010-01-31 364.6016 19.747621 6304.905