python algorithm mathematical-optimization scipy-optimize scipy-optimize-minimize

Finding a fast optimization algorithm to solve a non-linear equation with unique positive solution

Goal:

Find a fast algorithm in Python that solves the function f(x) below for its positive solution.

def f(x):
    return (l / ((np.tile(r, (n, 1)).transpose() / D / (np.tile(x, (n, 1)) / D).sum(axis = 0)).sum(axis = 1))) - x

l, r, and x are vectors of dimension n and D is a matrix of dimension n x n.

The function is known to have a positive solution that is unique up to a scaling factor. I would like to solve the function for different data and different vector length n. The largest n is approximately 4000.

What I have tried so far:

I tried various scipy.optimize functions. First, I tried fsolve which does not seem appropriate because it sometimes gives a solution vector x with negative entries. Following the answers to a related question, I tried minimize and constraining the solution to positive numbers to avoid negative entries in the solution. Minimize finds the global minimum that solves the function only when provided with the correct solution as a starting value. When the starting value differs (slighlty), the resulting vector does not solve the equation (and I need the exact solution). I assume that the algorithm finds local minima but not the global one. To find the global minimum I tried differential evolution. Here, the problem is that it is very slow for any useful n. I did all testing with n = 5 for which it finds the correct solution.

Question:

Which algorithms are good candidates to solve this equation? (How) Can I use what I know about the equation to speed up the calculation? (i.e. a positive solution exists, it is unique up to scaling)

Minimal working example:

import numpy as np
from scipy.optimize import minimize, fsolve, differential_evolution

np.random.seed(1)

# Vector dimension
n = 250 # differential evolution is slow, better use n = 5 to test

# Data r and D
r = np.random.rand(n)
D = 1 + np.random.rand(n, n)

# True solution x
x_true = np.random.rand(n)

# Normalize x to have geometric mean 1
x_true = x_true / np.prod(x_true) ** (1/n)

# Solve for l implied by true x
l = ((np.tile(r, (n, 1)).transpose() / D) / (np.tile(x_true, (n, 1)) / D).sum(axis = 0)).sum(axis = 1) * x_true


### Fsolve

initial_guess_deviation_factor = 2
x_0 = x_true * np.random.uniform(low = 0.9, high = 1.1, size = n) ** initial_guess_deviation_factor

def f(x):
    return (l / ((np.tile(r, (n, 1)).transpose() / D / (np.tile(x, (n, 1)) / D).sum(axis = 0)).sum(axis = 1))) - x

# The solution is negative
x = fsolve(f, x_0)


### Minimize

def opt(x):
    return (((l / ((np.tile(r, (n, 1)).transpose() / D / (np.tile(x, (n, 1)) / D).sum(axis = 0)).sum(axis = 1))) - x) ** 2).sum()
 
def pos_constraint(x):
    return x

result = minimize(opt, x0=x_0, constraints={'type': 'ineq', 'fun':pos_constraint}, tol = 1e-18)

# The solution is different from the true solution
print(abs(result.x - x_true).mean())
print(result.fun)


### Differential evolution

def opt(x):
    return (((l / ((np.tile(r, (n, 1)).transpose() / D / (np.tile(x, (n, 1)) / D).sum(axis = 0)).sum(axis = 1))) - x) ** 2).sum()
 
# Since the solution is unique up to renormalization, I use bounds between 0 and 1 and renormalize after finding the solution
bounds = [(0, 1)] * n
result = differential_evolution(opt, bounds, seed=1)
result.x, result.fun

# Normalize solution
x_de = result.x / np.prod(result.x) ** (1/n)

print(abs(x_de - x_true).mean())
print(result.fun)

Solution

First, do some linear-algebraic analysis to derive the following simplified, equivalent form of your problem (and include regression tests):

import numpy as np
from numpy.random._generator import default_rng
from scipy.optimize import fsolve


rand = default_rng(seed=0)

n = 250  # differential evolution is slow, better use n = 5 to test
r = rand.random(n)
D = rand.uniform(low=1, high=2, size=(n, n))
rDD = r[:, np.newaxis] / (1/D).sum(axis=0) / D

# True solution x with geometric mean 1
x_true = rand.random(n)
x_true = x_true / x_true.prod() ** (1 / n)

# Solve for l implied by true x
l = rDD @ (1 / x_true) * x_true


def f(x: np.ndarray) -> np.ndarray:
    return l / (rDD @ (1 / x)) - x


def regression_test() -> None:
    assert l.shape == (250,)
    assert np.isclose(l[0], 1.8437187927094683)
    assert np.isclose(l.min(), 0.008011379562766192)
    assert np.isclose(l.max(), 4.870546152196413)
    assert np.isclose(l.sum(), 328.4768373368165)

    f_0 = f(x_0)
    assert f_0.shape == (250,)
    assert np.isclose(f_0[0], -0.11599601776615498)
    assert np.isclose(f_0.min(), -0.5897953289580671)
    assert np.isclose(f_0.max(), 0.3885530509026145)
    assert np.isclose(f_0.sum(), -9.253079363636402)


initial_guess_deviation_factor = 2
x_0 = x_true * rand.uniform(low=0.9, high=1.1, size=n) ** initial_guess_deviation_factor
regression_test()

Maybe I'm missing something, but your first attempt is very close at being both fast and accurate; you just need to divide out one term and the rest will be non-negative:

x = fsolve(f, x_0)
x /= x[0]
print(x)
print(f(x))

prints the following for x:

[1.         0.21849311 0.62362266 1.37331647 1.0126681  1.20579018 ...

and the following for f(x):

[ 1.18594023e-12 -2.77555756e-15 -1.30429001e-12 -7.40074668e-13 ...