Search code examples
rggplot2conditional-statementsparameterization

R: How to (efficiently) parameterize and draw (conditional) functions?


I have a data.frame were each row contains the parameter values for a function (i.e. one function per row). I would like to draw these functions. The functions are conditional, and should only be drawn for certain values, and have kinks (due to min/max levels). See example of what I am trying to archive: Conditional functions

I originally considered using curve() or stat_function (ggplot2-library). But I could not see how it would be possible to draw the curves only for certain values (see a, b, c), without generating a data.frame.

So I created a function that generates the plot data:

N = 10000;

PrisstrukturToPlotdata = function(s){
  # Create empty data.frame:
  A <- data.frame(Site=rep(s$Site, N), bid=1:N, Saelger=rep(NA, N), Koeber=rep(NA, N), stringsAsFactors=FALSE)
  # Fill out the data.frame:
  for (i in 1:N) {

      # Don't draw below:
      if(i > s$Mindste_bud*s$Kurs) {

        # First parenthesis is condition to insure we are above min, second parenthesis is in-between min and max, and third parenthesis is condition for above max:        
        A[i, ]$Saelger = s$Saelger_Fast_salaer*s$Kurs + i*s$Saelger_Andet_pct +
          (i*s$Saelger_Variable_salaer <= s$Saelger_Min_variable_salaer*s$Kurs) * 
            s$Saelger_Min_variable_salaer*s$Kurs +
          (i*s$Saelger_Variable_salaer > s$Saelger_Min_variable_salaer | (s$Saelger_Max_variable_salaer != 0 & i*s$Saelger_Variable_salaer < s$Saelger_Max_variable_salaer*s$Kurs)) *
            i*s$Saelger_Variable_salaer
          (s$Saelger_Max_variable_salaer != 0 & i*s$Saelger_Variable_salaer >= s$Saelger_Max_variable_salaer*s$Kurs) *
            s$Saelger_Max_variable_salaer*s$Kurs;

        A[i, ]$Koeber = s$Koeber_Fast_salaer*s$Kurs + i*s$Koeber_Variable_salaer;

      }
  }
  return(A)
}

library(plyr)
Plotdata = adply(Prisstruktur, 1, PrisstrukturToPlotdata, .expand = FALSE) 

Conditionality explained: There is a minimum value, below which the curve should not be drawn at all; if(i > s$Mindste_bud*s$Kurs)). Then there is a percentage i*s$Saelger_Variable_salaer with respectively a minimum and maximum level (to complicate things not all functions have a max, those without the max value is just 0). If the percentage is below the minimum, the minimum level should be used. If the percentage is above the max, then the maximum level should be used. In between the percentage should be used.

The script above works okay for N=100 or even N=1000, but when I go to N=10000 or above it takes ages to run. I am guessing this is due to all the conditional statements, but I am not sure how to do this in a more efficient manner?


Dummy data:

Site = c('A', 'B', 'C')
Mindste_bud = c(300, 0 , 0)
Saelger_Fast_salaer = c(0, 250, 2)
Saelger_Variable_salaer = c(0.12, 0.16, 0.10)
Saelger_Min_variable_salaer = c(250, 0, 0)
Saelger_Max_variable_salaer = c(0, 0, 250)
Saelger_Andet_pct = c(0, 0, 0)
Koeber_Fast_salaer = c(95, 0, 0)
Koeber_Variable_salaer = c(0.2, 0.25, 0)
Kurs = c(1, 1, 5.430)
Prisstruktur = cbind(Site, Mindste_bud, Saelger_Fast_salaer, Saelger_Variable_salaer, Saelger_Min_variable_salaer, Saelger_Max_variable_salaer, Saelger_Andet_pct, Koeber_Fast_salaer, Koeber_Variable_salaer, Kurs)

Solution

  • You don't need a loop in your function. I doubt that you need all N = 10000 data points to get a nice plot. I've added structure to your code by using more whitespace and some ifelse functions for clarity.

    PrisstrukturToPlotdata <- function(s, N = 10000, Length = 101)
      n <- seq(s$Mindste_bud * s$Kurs + 1, N, length = Length)
      data.frame(
        Bid = n,
        Saelger = 
          s$Saelger_Fast_salaer * s$Kurs + 
          n * s$Saelger_Andet_pct +
          ifelse(
            n * s$Saelger_Variable_salaer <= s$Saelger_Min_variable_salaer * s$Kurs,
            s$Saelger_Min_variable_salaer * s$Kurs,
            0
          ) +
          ifelse(
            n * s$Saelger_Variable_salaer > s$Saelger_Min_variable_salaer | 
              (s$Saelger_Max_variable_salaer != 0 & 
                 n * s$Saelger_Variable_salaer < s$Saelger_Max_variable_salaer * s$Kurs),
            n * s$Saelger_Variable_salaer,
            0
          ) +       
          ifelse(
            s$Saelger_Max_variable_salaer != 0 & n * s$Saelger_Variable_salaer >= s$Saelger_Max_variable_salaer * s$Kurs,
            s$Saelger_Max_variable_salaer * s$Kurs,
            0
          ),
        Koeber = s$Koeber_Fast_salaer * s$Kurs + n * s$Koeber_Variable_salaer
      )
    )