Search code examples
roptimizationleast-squares

Using optim to find minimization while also forcing the parameters to sum to 1


I am trying to use R and optim to calculate mixing proportions. So, for example, let's say I have a rock composition, 60% SiO2 and 40% CaO. I want to know how much of two different phases I have to mix to produce my rock. Let's say phase2 is 35% SiO2 and 65% CaO and Phase2 80% SiO2 and 20% CaO.

***Edits: I have updated the code to include a 3rd phase, and my attempt at using compositions package.I also have attempted to set bounds on the optim search range.

#Telling R the composition of both phases and the target rock

library(compositions)

Phase1 <- c(35, 65)
Phase2 <- c(80, 20)
Phase3 <- c(10, 90)
Target_Composition <- c(60, 40)

#My function to minimize
my.function <- function(n){
    n <- clo(n) #I though this would make 1 = n[1] + n[2] + n[3]
    (Target_Composition[1] - (n[1]*Phase1[1] + n[2]*Phase2[1] + n[3]*Phase3[1]))^2 + 
    (Target_Composition[2] - (n[1]*Phase1[2] + n[2]*Phase2[2] + n[3]*Phase3[2]))^2 
}

I then run this in optim using:

optim(c(.54,.46), my.function)

When I run this, I get 0.536 and 0.487, which is indeed a minimum, but, I neeed to set an additional parameter that n[1] + n[2] = 1 . Is there anyway to do this using optim? That's what's puzzling me. As a side note, the actual problem I want to solve has many more phases and each phase is more complicated, but, I will scale up once I get this part working.

I now use:

optim(c(.5, .4, .1), my.function, lower=0, upper=1, method="L-BFGS-B")

I now get 0.4434595, 0.4986680, and 0.2371141 as results, which do not sum to 1. How can I fix this? Also, Is my new method of setting the search range between 0 and 1 going to work?

Thanks for your help.


Solution

  • For one parameter estimation - optimize() function is used to minimize a function.

    For two or more parameters estimation, optim() function is used to minimize a function. There is another function in base R called constrOptim() which can be used to perform parameter estimation with inequality constraints. In your problem, you are intending to apply box constraints. The L-BFGS-B solver is the appropriate one.

    Note: not all solvers converge, but all of them terminate based on some conditions implemented in the solvers (you can control them using the control argument), so you have to check whether the minimum obtained from the solver passes KKT (Karush, Kuhn, and Tucker) conditions to ensure your solver had actually converged, while estimating the parameters optimally.

    KKT conditions

    1. for a minimum, the gradient ( first derivatives ) has to be “zero”
    2. for a minimum, the Hessian (second order partial derivatives ) has to be positive definite symmetric matrix.

    You can also use the determinant of the hessian matrix to find whether the minimum is local, global or saddle point.

    Here, I am showing two ways to create objective functions for the given expression, such as symbolic - algebraic form and matrix form. I also compare the outputs from both of these functions, however, I am using the matrix form objective function to show the optimization step. Notice that the optimization step satisfies the constraints you mentioned in your question.

    n1 + n2 + n3 = 1
    n1 = (-2, 2)
    n2 = (-1, 1)
    n3 = (-1, 1)
    

    You might want to try the library optimx which makes the job easier.

    # load library
    library('compositions')
    
    ## function for getting algebraic expressions 
    get_fexpr <- function( phase_len, n_ingredients ) # len = length of n or phase1 or phase2 or target composition
    {
      ## phase_len = number of phases (eg: phase1, phase2, phase3: = 3)
      ## n_ingredients = number of ingredients that form a composition (Eg: sio2 and cao: = 2)
      ## y = target composition
      ## x = phase data as c(phase1[1], phase2[1], phase1[2], phase2[2])
      ## n = parameters to be estimated
    
      p_n <- paste( rep("n", phase_len), 1:phase_len, sep = '')  # n
      p_x <- paste( rep("x", phase_len), paste( "[", 1:( phase_len * n_ingredients ), "]", sep = ''), sep = '' ) # x
      p_y <- paste( "y[", 1:n_ingredients, "]", sep = ""  )    # y
    
      # combine n, x, and y
      p_n_x <- paste( "(", paste( p_n, p_x, sep = "*" ), ")", sep = '')
      p_n_x <- lapply( split( p_n_x, ceiling( seq_along( p_n_x ) / phase_len ) ), paste, collapse = " + " )
      p_n_x <- lapply( p_n_x, function( x ) paste( "(", x, ")", sep = "" ) )
    
      # get deviations and sum of squares
      dev    <- mapply( paste, p_y, p_n_x, sep = " - " )  # deviations
      dev_sq <- paste( "(", dev, ")**2", sep = '', collapse = " + ")  # sum of squares of deviations
    
      return( dev_sq )
    }
    
    get_fexpr( phase_len = 1, n_ingredients = 1 )
    # [1] "(y[1] - ((n1*x[1])))**2"
    get_fexpr( phase_len = 1, n_ingredients = 2 )
    # [1] "(y[1] - ((n1*x[1])))**2 + (y[2] - ((n1*x[2])))**2"
    get_fexpr( phase_len = 2, n_ingredients = 2 )
    # [1] "(y[1] - ((n1*x[1]) + (n2*x[2])))**2 + (y[2] - ((n1*x[3]) + (n2*x[4])))**2"
    get_fexpr( phase_len = 3, n_ingredients = 2 )
    # [1] "(y[1] - ((n1*x[1]) + (n2*x[2]) + (n3*x[3])))**2 + (y[2] - ((n1*x[4]) + (n2*x[5]) + (n3*x[6])))**2"
    get_fexpr( phase_len = 4, n_ingredients = 2 )
    # [1] "(y[1] - ((n1*x[1]) + (n2*x[2]) + (n3*x[3]) + (n4*x[4])))**2 + (y[2] - ((n1*x[5]) + (n2*x[6]) + (n3*x[7]) + (n4*x[8])))**2"
    
    ## objective functions for max/min
    ## matrix form
    myfun1 <- function( n, phase_data, Target_Composition )
    {
      print(n)
      Target_Composition <- matrix(Target_Composition, ncol = length(Target_Composition), byrow = TRUE)
      dot_product <- n %*% phase_data   # get dot product
      sum((Target_Composition - dot_product)^2)
    }
    
    ## algebraic expression form
    myfun2 <- function( y, x, n, fexpr )
    {
      names(n) <- paste( "n", 1:length( n ), sep = "" ) # assign names to n
      list2env( as.list(n), envir = environment() )     # create new variables with a list of n
      return( eval( parse( text = fexpr ) ) )
    } 
    
    ## Comparison of functions of matrix and algebriac forms
    ## 1. raw data for two parameters estimation
    Target_Composition <- clo( c(60, 40), total = 1 )
    Phase1 <- clo( c(35, 65), total = 1 )
    Phase2 <- clo( c(80, 20), total = 1 )
    stopifnot( length( Phase1 ) == length( Phase2 ))  
    n      <- clo( c(0.54, 0.46) , total = 1 )
    
    ## data for matrix form
    phase_data_concat <- c(Phase1, Phase2)
    n_row <- length(phase_data_concat) / length(Phase1)
    n_col <- length(phase_data_concat) / n_row
    phase_data <- matrix( data = phase_data_concat, 
                          nrow = n_row, 
                          ncol = n_col, 
                          byrow = TRUE,
                          dimnames = list(c('phase1', 'phase2'), 
                                          c('sio2', 'cao')))
    
    ## data for algebraic form
    y <- Target_Composition
    x <- c(matrix( data = phase_data_concat, nrow = nrow( phase_data ), ncol = ncol( phase_data ), byrow = TRUE ))
    n <- n
    fexpr <- get_fexpr( phase_len = length( n ), n_ingredients = 2 )
    
    
    # compare algebraic form and matrix form functions
    myfun1(n, phase_data, Target_Composition)
    # [1] 0.0036979999999999969
    myfun2( y = y, x = x, n = n, fexpr = fexpr )
    # [1] 0.0036979999999999969
    
    ## 2. raw data for three parameters estimation
    Target_Composition <- clo( c(60, 40), total = 1)
    Phase1 <- clo( c(35, 65), total = 1)
    Phase2 <- clo( c(80, 20), total = 1)
    Phase3 <- clo( c(10, 90), total = 1)
    stopifnot( length( Phase1 ) == length( Phase2 ) && length( Phase1 ) == length( Phase3 ))  
    n <- clo( c(0.5, 0.4, 0.1) )   # starting guess for n1, n2, and n3
    
    ## data for matrix form
    phase_data_concat <- c(Phase1, Phase2, Phase3)
    n_row <- length(phase_data_concat) / length(Phase1)
    n_col <- length(phase_data_concat) / n_row
    phase_data <- matrix( data = phase_data_concat, 
                          nrow = n_row, 
                          ncol = n_col, 
                          byrow = TRUE,
                          dimnames = list(c('phase1', 'phase2', 'phase3'), 
                                          c('sio2', 'cao')))
    ## data for algebraic form
    y <- Target_Composition
    x <- c( matrix( phase_data_concat, nrow = nrow( phase_data), ncol = ncol(phase_data), byrow = TRUE ) )
    n <- n
    fexpr <- get_fexpr( phase_len = length( n ), n_ingredients = 2 )
    
    # compare algebraic form and matrix form functions
    myfun1(n, phase_data, Target_Composition)
    # [1] 0.01805
    myfun2( y = y, x = x, n = n, fexpr = fexpr )
    # [1] 0.01805
    
    
    ## Optimization using matrix form objective function (myfun1)
    # three parameter estimation
    phase_data
    #                       sio2                 cao
    # phase1 0.34999999999999998 0.65000000000000002
    # phase2 0.80000000000000004 0.20000000000000001
    # phase3 0.10000000000000001 0.90000000000000002
    
    # target data
    Target_Composition <- clo( c(60, 40), total = 1 )
    # [1] 0.59999999999999998 0.40000000000000002
    
    n <- c(0.5, 0.4, 0.1)
    # [1] 0.50000000000000000 0.40000000000000002 0.10000000000000001
    
    ## Harker diagram; also called scatterplot of two componenets without any transformation
    plot( phase_data, type = "b", main = "Harker Diagram" )   
    
    optim_model <- optim(par = n, 
                         fn = myfun1,
                         method = "L-BFGS-B", 
                         lower = c(-2, -1, -1),  # lower bounds: n1 = -2; n2 = -1; n3 = -1 
                         upper = c( 2, 1, 1 ),   # upper bounds: n1 = 2;  n2 = 1   
                         phase_data = phase_data,
                         Target_Composition = Target_Composition)
    
    # [1]  0.50000000000000000   0.40000000000000002   0.10000000000000001
    # [1]  0.50100000000000000   0.40000000000000002   0.10000000000000001
    # [1]  0.49900000000000000   0.40000000000000002   0.10000000000000001
    # [1]  0.50000000000000000   0.40100000000000002   0.10000000000000001
    # [1]  0.50000000000000000   0.39900000000000002   0.10000000000000001
    # [1]  0.50000000000000000   0.40000000000000002   0.10100000000000001
    # [1]  0.500000000000000000  0.400000000000000022  0.099000000000000005
    # [1]  0.443000000000007166  0.514000000000010004 -0.051999999999988944
    # [1]  0.444000000000007167  0.514000000000010004 -0.051999999999988944
    # [1]  0.442000000000007165  0.514000000000010004 -0.051999999999988944
    # [1]  0.443000000000007166  0.515000000000010005 -0.051999999999988944
    # [1]  0.443000000000007166  0.513000000000010004 -0.051999999999988944
    # [1]  0.443000000000007166  0.514000000000010004 -0.050999999999988943
    # [1]  0.443000000000007166  0.514000000000010004 -0.052999999999988945
    # [1]  0.479721654922847740  0.560497432130581008 -0.020709332414779191
    # [1]  0.480721654922847741  0.560497432130581008 -0.020709332414779191
    # [1]  0.478721654922847739  0.560497432130581008 -0.020709332414779191
    # [1]  0.479721654922847740  0.561497432130581009 -0.020709332414779191
    # [1]  0.479721654922847740  0.559497432130581007 -0.020709332414779191
    # [1]  0.47972165492284774   0.56049743213058101  -0.01970933241477919
    # [1]  0.479721654922847740  0.560497432130581008 -0.021709332414779191
    # [1]  0.474768384608834304  0.545177697419992557 -0.019903455841806163
    # [1]  0.475768384608834305  0.545177697419992557 -0.019903455841806163
    # [1]  0.473768384608834303  0.545177697419992557 -0.019903455841806163
    # [1]  0.474768384608834304  0.546177697419992558 -0.019903455841806163
    # [1]  0.474768384608834304  0.544177697419992557 -0.019903455841806163
    # [1]  0.474768384608834304  0.545177697419992557 -0.018903455841806163
    # [1]  0.474768384608834304  0.545177697419992557 -0.020903455841806164
    # [1]  0.474833910147636595  0.544703104470840138 -0.019537864476362268
    # [1]  0.475833910147636596  0.544703104470840138 -0.019537864476362268
    # [1]  0.473833910147636594  0.544703104470840138 -0.019537864476362268
    # [1]  0.474833910147636595  0.545703104470840139 -0.019537864476362268
    # [1]  0.474833910147636595  0.543703104470840137 -0.019537864476362268
    # [1]  0.474833910147636595  0.544703104470840138 -0.018537864476362267
    # [1]  0.474833910147636595  0.544703104470840138 -0.020537864476362269
    # [1]  0.474834452107517901  0.544702005703077585 -0.019536411001123268
    # [1]  0.475834452107517902  0.544702005703077585 -0.019536411001123268
    # [1]  0.473834452107517901  0.544702005703077585 -0.019536411001123268
    # [1]  0.474834452107517901  0.545702005703077586 -0.019536411001123268
    # [1]  0.474834452107517901  0.543702005703077584 -0.019536411001123268
    # [1]  0.474834452107517901  0.544702005703077585 -0.018536411001123267
    # [1]  0.474834452107517901  0.544702005703077585 -0.020536411001123269
    
    optim_model$par   # values of n after minimization of function my.function using starting guess of n1 = 0.54, n2 = 0.46
    # [1]  0.474834452107517901  0.544702005703077585 -0.019536411001123268
    
    sum(optim_model$par)
    # [1] 1.0000000468094723
    
    # different starting guess - n
    n <- c(0.2, 0.2, 0.6)
    
    optim_model <- optim(par = n, 
                         fn = myfun1,
                         method = "L-BFGS-B", 
                         lower = c(-2, -1, -1),  # lower bounds: n1 = -2; n2 = -1; n3 = -1 
                         upper = c( 2, 1, 1 ),   # upper bounds: n1 = 2;  n2 = 1   
                         phase_data = phase_data,
                         Target_Composition = Target_Composition)
    
    # [1] 0.20000000000000001 0.20000000000000001 0.59999999999999998
    # [1] 0.20100000000000001 0.20000000000000001 0.59999999999999998
    # [1] 0.19900000000000001 0.20000000000000001 0.59999999999999998
    # [1] 0.20000000000000001 0.20100000000000001 0.59999999999999998
    # [1] 0.20000000000000001 0.19900000000000001 0.59999999999999998
    # [1] 0.20000000000000001 0.20000000000000001 0.60099999999999998
    # [1] 0.20000000000000001 0.20000000000000001 0.59899999999999998
    # [1] 0.014000000000008284 0.571999999999969644 0.103999999999989656
    # [1] 0.015000000000008284 0.571999999999969644 0.103999999999989656
    # [1] 0.013000000000008283 0.571999999999969644 0.103999999999989656
    # [1] 0.014000000000008284 0.572999999999969645 0.103999999999989656
    # [1] 0.014000000000008284 0.570999999999969643 0.103999999999989656
    # [1] 0.014000000000008284 0.571999999999969644 0.104999999999989657
    # [1] 0.014000000000008284 0.571999999999969644 0.102999999999989655
    # [1] 0.13382855816928069 0.72372846274181657 0.20610638896226857
    # [1] 0.13482855816928069 0.72372846274181657 0.20610638896226857
    # [1] 0.13282855816928069 0.72372846274181657 0.20610638896226857
    # [1] 0.13382855816928069 0.72472846274181657 0.20610638896226857
    # [1] 0.13382855816928069 0.72272846274181657 0.20610638896226857
    # [1] 0.13382855816928069 0.72372846274181657 0.20710638896226857
    # [1] 0.13382855816928069 0.72372846274181657 0.20510638896226857
    # [1] 0.11766525503937687 0.67373774947575715 0.20873609146356592
    # [1] 0.11866525503937687 0.67373774947575715 0.20873609146356592
    # [1] 0.11666525503937687 0.67373774947575715 0.20873609146356592
    # [1] 0.11766525503937687 0.67473774947575715 0.20873609146356592
    # [1] 0.11766525503937687 0.67273774947575715 0.20873609146356592
    # [1] 0.11766525503937687 0.67373774947575715 0.20973609146356592
    # [1] 0.11766525503937687 0.67373774947575715 0.20773609146356592
    # [1] 0.11787907521862623 0.67218907774694359 0.20992907381396153
    # [1] 0.11887907521862623 0.67218907774694359 0.20992907381396153
    # [1] 0.11687907521862623 0.67218907774694359 0.20992907381396153
    # [1] 0.11787907521862623 0.67318907774694359 0.20992907381396153
    # [1] 0.11787907521862623 0.67118907774694359 0.20992907381396153
    # [1] 0.11787907521862623 0.67218907774694359 0.21092907381396153
    # [1] 0.11787907521862623 0.67218907774694359 0.20892907381396153
    # [1] 0.11788084371929158 0.67218549229424496 0.20993381673316230
    # [1] 0.11888084371929158 0.67218549229424496 0.20993381673316230
    # [1] 0.11688084371929158 0.67218549229424496 0.20993381673316230
    # [1] 0.11788084371929158 0.67318549229424496 0.20993381673316230
    # [1] 0.11788084371929158 0.67118549229424496 0.20993381673316230
    # [1] 0.11788084371929158 0.67218549229424496 0.21093381673316230
    # [1] 0.11788084371929158 0.67218549229424496 0.20893381673316230
    
    optim_model$par   # values of n after minimization of function my.function using starting guess of n1 = 0.54, n2 = 0.46
    # [1] 0.11788084371929158 0.67218549229424496 0.20993381673316230
    
    sum(optim_model$par)
    # [1] 1.0000001527466988
    

    Visualization: I modified myfun1 to myfun3 for visualizing the objective function using box constraints. I used two parameter estimation model.

    # modified function for visualization
    myfun3 <- function( n1, n2, phase_data, Target_Composition )
    {
      Target_Composition <- matrix(Target_Composition, ncol = length(Target_Composition), byrow = TRUE)
      dot_product <- c(n1, n2) %*% phase_data   # get dot product
      sum((Target_Composition - dot_product)^2)
    }
    myfun3 <- Vectorize(FUN = myfun3, vectorize.args = list( "n1", "n2"))
    
    Target_Composition <- clo( c(60, 40), total = 1 )
    Phase1 <- clo( c(35, 65), total = 1 )
    Phase2 <- clo( c(80, 20), total = 1 )
    stopifnot( length( Phase1 ) == length( Phase2 ))  
    n      <- clo( c(0.54, 0.46) , total = 1 )
    
    ## data for matrix form
    phase_data_concat <- c(Phase1, Phase2)
    n_row <- length(phase_data_concat) / length(Phase1)
    n_col <- length(phase_data_concat) / n_row
    phase_data <- matrix( data = phase_data_concat, 
                          nrow = n_row, 
                          ncol = n_col, 
                          byrow = TRUE,
                          dimnames = list(c('phase1', 'phase2'), 
                                          c('sio2', 'cao')))
    
    
    n1 = seq(-2, 2, 0.1)  # bounds for n1
    n2 = seq(-1, 1, 0.1)  # bounds for n2
    z <- outer( n1, n2, phase_data, Target_Composition, FUN = myfun3)
    persp(n1, n2, z,theta=150)
    

    enter image description here