I am a student in condensed matter physics. I have run into many questions that require global optimization, such as finding the most stable crystal structure for a given composition.
These questions can be summarized as follows (NP-hard, I believe):
1) given a function f(x1, x2, x3, ...) which is a total black box and sometimes expensive to evaluate,
2) given a discrete region S,
3) find the global maximum of f over S.
I know there are many algorithms, such as evolutionary algorithms, particle swarm optimization, simulated annealing, and metadynamics; and if f(x) is not very expensive, I can use some sample points to train a neural network as a surrogate. But I want to know:
1) Are there more algorithms that can do the same thing?
2) What are the convergence guarantees and rates of convergence of these algorithms? I found some papers saying that most of these algorithms converge, but how fast do they converge compared with brute-force search over the entire space?
Thanks a lot to anyone who can provide information, or pointers to where to find it, about the above problem.
The general way the theory of hardness works is this: if you can prove that the algorithm under study can be used to solve a known hard problem, then that algorithm cannot guarantee both correctness and tractable (polynomial) running time, else we would have P = NP, which is thought not to be the case, although there is no proof of this yet. There are now huge catalogs of NP-complete and NP-hard problems. For your purposes, note in particular that quadratic programming (https://en.wikipedia.org/wiki/Quadratic_programming#Complexity) is one of them. It is also often fairly easy to take a combinatorial problem and come up with a continuous function whose global minimum sits at the solution of the combinatorial problem, so again, if you could find the global minimum of the continuous problem, you could solve the hard combinatorial one.
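To make the last point concrete, here is a small sketch (my own illustration, not from a specific reference) of embedding a hard combinatorial problem in a continuous one. Number partitioning, which is NP-hard, asks whether a multiset of integers can be split into two halves with equal sums; the function below is zero exactly at vectors x with each x_i = +/-1 that encode such a split, so a guaranteed global minimizer for it would decide the combinatorial problem.

```python
def partition_objective(a, x):
    """Continuous encoding of number partitioning.

    The first term vanishes when the signed sum balances the two halves;
    the penalty term forces each coordinate toward +/-1. A global minimum
    of exactly 0 certifies a perfect partition of the integers in a.
    """
    signed_sum = sum(ai * xi for ai, xi in zip(a, x))
    penalty = sum((xi * xi - 1.0) ** 2 for xi in x)
    return signed_sum ** 2 + penalty

a = [3, 1, 1, 2, 2, 1]           # toy instance: {3, 2} vs {1, 1, 2, 1}, both sum to 5
x_good = [1, -1, -1, 1, -1, -1]  # encodes that partition
x_bad = [1, 1, -1, 1, -1, -1]    # an unbalanced split

print(partition_objective(a, x_good))  # 0.0 -> perfect partition
print(partition_objective(a, x_bad))   # 4.0 -> penalized imbalance
```

Any local method applied to this function faces exactly the difficulty described below: the landscape has many minima, and only the zero-valued ones answer the original question.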
The numerical analysis books are full of methods that take derivatives and converge quickly to a local optimum under reasonable assumptions. There is also Torczon Simplex, which has a convergence proof under reasonable assumptions but doesn't require derivatives (reference in https://en.wikipedia.org/wiki/Pattern_search_(optimization); see also https://help.scilab.org/doc/5.5.2/en_US/optimsimplex_overview.html). The catch is that convergence is proved only to a local optimum, and there may be exponentially many local optima. Note that even SUM_i (X_i^2-1)^2 has exponentially many local optima, although they do all yield the same value. One idea is to repeatedly converge from random starting positions and pick the best answer found, but clearly the chance of hitting the global optimum may be small.
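The multistart idea can be sketched as follows, using the toy objective SUM_i (x_i^2 - 1)^2 mentioned above. This is a minimal illustration under stated assumptions (fixed-step gradient descent, uniform random starts), not a production method: the function has 2^n minima, one per sign pattern x_i = +/-1, and which one a local method reaches depends entirely on the start.

```python
import random

def f(x):
    """Toy objective with 2^n minima at every sign pattern x_i = +/-1."""
    return sum((xi * xi - 1.0) ** 2 for xi in x)

def grad(x):
    """Analytic gradient of f, component-wise 4 x_i (x_i^2 - 1)."""
    return [4.0 * xi * (xi * xi - 1.0) for xi in x]

def local_descent(x, step=0.05, iters=500):
    """Fixed-step gradient descent: converges only to a nearby local minimum."""
    for _ in range(iters):
        g = grad(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

random.seed(0)
n = 8
minima = set()
for _ in range(20):  # 20 random restarts, each drawn uniformly from [-2, 2]^n
    x0 = [random.uniform(-2.0, 2.0) for _ in range(n)]
    x = local_descent(x0)
    # Record which of the 2^n sign patterns this run converged to.
    minima.add(tuple(1 if xi > 0 else -1 for xi in x))

print(len(minima), "distinct minima found out of", 2 ** n)
```

With n = 8 there are already 256 basins, so 20 restarts typically land in 20 different ones; for this particular function that is harmless (all minima share the same value), but for a function with a unique global optimum the same experiment shows why random restarts give no guarantee.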
Simulated annealing and its variants have a proof of convergence to the global optimum, but if you look at how it is proved, it boils down to saying that if you leave the program running long enough it will eventually stumble over the right answer. The time to convergence therefore grows exponentially with problem size, and is of the same order as (or worse than) repeatedly converging to a local optimum from multiple random starts. The assumptions sometimes made here, in particular that you are optimizing a random function, also underlie the no-free-lunch theorems (https://en.wikipedia.org/wiki/No_free_lunch_in_search_and_optimization), suggesting that these exotic approaches come with their own weaknesses, or at least that there is no single best all-purpose algorithm.
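For completeness, here is a bare-bones simulated annealing sketch (my own illustration, with a toy objective and parameters chosen for demonstration). Note the gap between theory and practice mentioned above: the convergence proofs require an impractically slow, logarithmic cooling schedule, while the geometric schedule used here is what people actually run, and it carries no global-optimality guarantee.

```python
import math
import random

def anneal(f, n, steps=20000, t0=2.0, alpha=0.9995, seed=1):
    """Maximize f over {-1, +1}^n by single-coordinate flips.

    Uses the Metropolis acceptance rule with geometric cooling
    (t_k = t0 * alpha^k), a practical heuristic schedule rather
    than the provably convergent logarithmic one.
    """
    rng = random.Random(seed)
    x = [rng.choice((-1, 1)) for _ in range(n)]
    cur = f(x)
    best, best_val = x[:], cur
    t = t0
    for _ in range(steps):
        i = rng.randrange(n)
        x[i] = -x[i]                 # propose flipping one coordinate
        new = f(x)
        # Always accept improvements; accept worsening moves with
        # probability exp(delta / t) so the walk can escape local optima.
        if new >= cur or rng.random() < math.exp((new - cur) / t):
            cur = new
            if cur > best_val:
                best, best_val = x[:], cur
        else:
            x[i] = -x[i]             # reject: undo the flip
        t *= alpha                   # geometric cooling
    return best, best_val

# Toy objective: the negated imbalance of a number partition,
# whose true maximum is 0.0 when a perfect split exists.
a = [3, 1, 4, 2, 2]
cost = lambda x: -float(sum(ai * xi for ai, xi in zip(a, x))) ** 2
best, best_val = anneal(cost, n=len(a))
print(best, best_val)
```

On such a tiny instance the annealer almost always finds the optimum; the point of the answer above is that as n grows, the number of steps needed for any such guarantee grows exponentially.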