Introduction
Let's say I have a set of experimental data and need to find a polynomial approximation that describes the selected series. The experimental result depends on two variables: time and concentration. Let the example data look as follows:
Experiment=[1.5 0.2 0.4 0.4 0.2 0.2 2.0 0.2 0.4 0.4 0.2 0.2];
Time=[0 5 10 0 5 10 0 5 10 0 5 10];
Concentration=[0 0 0 1 1 1 2 2 2 3 3 3];
The polynomial can easily be fitted and plotted as follows:
Time = transpose(Time);
Concentration = transpose(Concentration);
Experiment= transpose(Experiment);
f1 = fit( [Time, Concentration], Experiment, 'poly23' );
pl=plot(f1, [Time, Concentration], Experiment);
The problem
The simple procedure described above works fine and produces a polynomial surface. But when the time is 4-10 and the concentration is below 1, the polynomial takes negative values. The system I'm investigating is biological, so negative values are physically impossible. So my questions are: How do I set boundaries/constraints to prevent the resulting polynomial from being negative over the experimental range? How do I force MATLAB to give me an approximation with Z > 0 whenever time is between 0 and 10 and concentration is between 0 and 3?
For a nonlinear constrained optimization using fmincon, you first need to define a function that determines z (i.e. predicts the outcome for given x and y):
function z = poly_model(x, y, p)
    % extract parameters
    p00 = p(1);
    p10 = p(2);
    p01 = p(3);
    p20 = p(4);
    p11 = p(5);
    p02 = p(6);
    p21 = p(7);
    p12 = p(8);
    p03 = p(9);
    % poly23 model
    z = p00 + p10 .* x + p01 .* y + p20 .* x.^2 + p11 .* x .* y + ...
        p02 .* y.^2 + p21 .* x.^2 .* y + p12 .* x .* y.^2 + p03 .* y.^3;
end
Note that all multiplications and powers are element-wise (.* and .^). This allows the function to be evaluated for matrix inputs x and y, which is needed to calculate the constraint you want to impose over the range of the experimental data.
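As a quick sanity check of the element-wise behaviour (a minimal sketch; p_test is an arbitrary parameter vector, not a fitted one), poly_model can be evaluated on a whole grid in a single call:

```matlab
% evaluate poly_model on a grid in one call, thanks to the element-wise operators
[X, Y] = meshgrid(0:2:10, 0:0.5:3); % time and concentration grid
p_test = ones(9, 1);                % arbitrary test parameters
Z = poly_model(X, Y, p_test);       % Z has the same size as X and Y
```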
The constraint has to be defined in a separate function. From the docs:
Nonlinear constraints, specified as a function handle or function name. nonlcon is a function that accepts a vector or array x and returns two arrays, c(x) and ceq(x).
c(x) is the array of nonlinear inequality constraints at x. fmincon attempts to satisfy
c(x) <= 0 for all entries of c.
ceq(x) is the array of nonlinear equality constraints at x. fmincon attempts to satisfy
ceq(x) = 0 for all entries of ceq.
So in your case, the constraint function can be defined as:
function [c, ceq] = constraint_eq(x, y, p)
    % evaluate the model for the required x and y
    z_model = poly_model(x, y, p);
    % and constrain z to be positive:
    c = -z_model; % z_model >= 0 requires c <= 0, hence c = -z_model
    % no equality constraints needed
    ceq = [];
end
Next, you need to define a cost function that quantifies the error between the experimental data and the model prediction:
function err = cost_function(x, y, z, p)
    z_model = poly_model(x, y, p); % model prediction z for x and y
    ev = z_model - z;              % error vector
    err = norm(ev, 2)^2;           % sum of squared errors
end
And finally, call the fmincon routine:
clc
clear
close all
% data
Experiment = [1.5 0.2 0.4 0.4 0.2 0.2 2.0 0.2 0.4 0.4 0.2 0.2];
Time = [0 5 10 0 5 10 0 5 10 0 5 10];
Concentration = [0 0 0 1 1 1 2 2 2 3 3 3];
% short notation for readability
x = Time;
y = Concentration;
z = Experiment;
% define XV and YV to fulfil constraint over the entire x and y domain
xv = linspace(min(x), max(x), 20);
yv = linspace(min(y), max(y), 20);
[XV, YV] = meshgrid(xv, yv);
% random initial guess for the parameters
p0 = rand(9, 1);
p_final = fmincon(@(p) cost_function(x, y, z, p), p0, [], [], [], [], [], [], @(p) constraint_eq(XV, YV, p));
%% check result:
ZV = poly_model(XV, YV, p_final); % evaluate points in poly23 plane
% plot result
figure(1); clf;
scatter3(x, y, z, 200, 'b.');
hold on;
surf(XV, YV, ZV)
Influence of initial parameters p0
As @James Philips pointed out in the comments, you can also use the solution of an unconstrained optimization as the starting point for the constrained optimization. For the provided experimental data and chosen model, you will see that there is not much of a difference:
% The random initial guess:
p0 = rand(9, 1);
% Optimal solution for random p0
p_rand = fmincon(@(p) cost_function(x, y, z, p), p0, [], [], [], [], [], [], @(p) constraint_eq(XV, YV, p));
% first run an unconstrained optimization and use p_unc
% as the start point for the constrained optimization
p_unc = fmincon(@(p) cost_function(x, y, z, p), p0, [], []);
p_con = fmincon(@(p) cost_function(x, y, z, p), p_unc, [], [], [], [], [], [], @(p) constraint_eq(XV, YV, p));
% Compare errors:
SSE_unc = cost_function(x,y,z,p_unc)
SSE_con = cost_function(x,y,z,p_con)
SSE_rand = cost_function(x,y,z,p_rand)
% compare poly23 parameters
p_all = [p_unc, p_con, p_rand]
Which will give:
SSE_unc =
1.0348
SSE_con =
1.1889
SSE_rand =
1.1889
p_all =
1.3375 1.2649 1.2652
-0.3425 -0.2617 -0.2618
-1.6069 -1.0620 -1.0625
0.0258 0.0187 0.0187
0.0175 -0.0018 -0.0016
1.5708 1.0717 1.0721
-0.0042 -0.0018 -0.0018
0.0125 0.0094 0.0094
-0.3722 -0.2627 -0.2628
In this case you see only a very minor difference in the found parameters, but the solver most likely requires fewer iterations to reach this solution. By tweaking the solver settings (optimality tolerance and constraint tolerance), the solutions for p_rand and p_con will get closer.
In general, however, it is good practice to try multiple random initial guesses to make sure that you are not finding a local minimum (for example by using MultiStart).
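A MultiStart run could look roughly like this (a sketch, assuming the Global Optimization Toolbox is available; the number of start points is arbitrary):

```matlab
% wrap the constrained problem for MultiStart
problem = createOptimProblem('fmincon', ...
    'objective', @(p) cost_function(x, y, z, p), ...
    'x0', rand(9, 1), ...
    'nonlcon', @(p) constraint_eq(XV, YV, p));
ms = MultiStart;
[p_best, SSE_best] = run(ms, problem, 50); % try 50 random start points
```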