I am trying to implement steepest descent algorithm in programming languages (C/C++/fortran).
For example minimization of f(x1,x2) = x1^3 + x2^3 - 2*x1*x2
Estimate starting design point x0, iteration counter k0, convergence parameter tolerence = 0.1. Say this staring point is (1,0)
Compute gradient of f(x1,x2) at the current point x(k) as grad(f). I will use numerical differentiation here.
d/dx1 (f) = lim (h->0) (f(x1+h,x2) - f(x1,x2) )/h
This is grad(f)=(3*x1^2 - 2*x2, 3*x2^2 - 2*x1)
grad(f) at (0,1) is c0 = (3,-2)
since L2 norm of c0 > tolerence, we proceed for next step
direction d0 = -c0 = (-3,2)
Calculate step size a. Minimize f(a) = f(x0 + ad0) = (1-3a,2a) = (1-3a)^3 + (2a)^3 - 2(1-3a)*(2a). I am not keeping constant step size.
update: new[x1,x2] = old[x1,x2]x + a*d0.
I do not understand how to do step 5. I have a 1D minimization program with bisection method, and it looks like:
program main()
...
...
define upper, lower interval
call function value
...calculations
...
...
function value (input x1in) (output xout)
...function is x^4 - 2x^2 + x + 10
xout = (xin)^4 - 2*(xin)^2 + (xin) + 10
In this case, looking at step 5, I cannot pass symbolic a. Any ideas how to implement the algorithm in programming language, especially step 5? Please suggest if there is altogether different way to program this. I have seen many programs with constant step size, but I want to compute it at every step. This algorithm can be easy to implement in MATLAB ot python sympy using symbolics, but I do not want to use symbolics. Any suggestions appreciated. Thanks.
If C++ is an option, you can take advantage of functors and lambdas.
Let's consider a function we want to minimize, for example y = x2 - x + 2. It can be represented as a function object, which is a class with an overloaded operator()
:
struct MyFunc {
double operator()( double x ) const {
return x * x - x + 2.0;
}
};
Now we can declare an object of this type, use it like a function and pass it to other templated function as a templated parameter.
// given this templated function:
template < typename F >
void tabulate_function( F func, double a, double b, int steps ) {
// the functor ^^^^^^ is passed to the templated function
double step = (b - a) / (steps - 1);
std::cout << " x f(x)\n------------------------\n";
for ( int i = 0; i < steps; ++i ) {
double x = a + i * step,
fx = func(x);
// ^^^^^^^ call the operator() of the functor
std::cout << std::fixed << std::setw(8) << std::setprecision(3) << x
<< std::scientific << std::setw(16) << std::setprecision(5)
<< fx << '\n';
}
}
// we can use the previous functor like this:
MyFunc example;
tabulate_function(example, 0.0, 2.0, 21);
OP's function can be implemented (given an helper class to represent 2D points) in a similar way:
struct MyFuncVec {
double operator()( const Point &p ) const {
return p.x * p.x * p.x + p.y * p.y * p.y - 2.0 * p.x * p.y;
}
};
The gradient of that function can be represented (given a class which implement a 2D vector) by:
struct MyFuncGradient {
Vector operator()( const Point &p ) {
return Vector(3.0 * p.x * p.x - 2.0 * p.y, 3.0 * p.y * p.y - 2.0 * p.x);
}
};
Now, the fifth step of OP question requests to minimize the first function along the direction of the gradient using a monodimensional optimization algorithm which requires a monodimensional function to be passed. We can solve this issue using a lambda:
MyFuncVec funcOP;
MyFuncGradient grad_funcOP;
Point p0(0.2, 0.8);
Vector g = grad_funcOP(p0);
// use a lambda to transform the OP function to 1D
auto sliced_func = [&funcOP, &p0, &g] ( double t ) -> double {
// those variables ^^^ ^^^ ^^ are captured and used
return funcOP(p0 - t * g);
};
tabulate_function(sliced_func, 0, 0.5, 21);
Live example HERE.