Goal:
Find a fast algorithm in Python that solves the function f(x) below for its positive solution.
def f(x):
return (l / ((np.tile(r, (n, 1)).transpose() / D / (np.tile(x, (n, 1)) / D).sum(axis = 0)).sum(axis = 1))) - x
l, r, and x are vectors of dimension n and D is a matrix of dimension n x n.
The function is known to have a positive solution that is unique up to a scaling factor. I would like to solve the function for different data and different vector length n. The largest n is approximately 4000.
What I have tried so far:
I tried various scipy.optimize
functions. First, I tried fsolve
which does not seem appropriate because it sometimes gives a solution vector x with negative entries. Following the answers to a related question, I tried minimize
and constraining the solution to positive numbers to avoid negative entries in the solution. Minimize finds the global minimum that solves the function only when provided with the correct solution as a starting value. When the starting value differs (slighlty), the resulting vector does not solve the equation (and I need the exact solution). I assume that the algorithm finds local minima but not the global one. To find the global minimum I tried differential evolution
. Here, the problem is that it is very slow for any useful n. I did all testing with n = 5 for which it finds the correct solution.
Question:
Which algorithms are good candidates to solve this equation? (How) Can I use what I know about the equation to speed up the calculation? (i.e. a positive solution exists, it is unique up to scaling)
Minimal working example:
import numpy as np
from scipy.optimize import minimize, fsolve, differential_evolution
np.random.seed(1)
# Vector dimension
n = 250 # differential evolution is slow, better use n = 5 to test
# Data r and D
r = np.random.rand(n)
D = 1 + np.random.rand(n, n)
# True solution x
x_true = np.random.rand(n)
# Normalize x to have geometric mean 1
x_true = x_true / np.prod(x_true) ** (1/n)
# Solve for l implied by true x
l = ((np.tile(r, (n, 1)).transpose() / D) / (np.tile(x_true, (n, 1)) / D).sum(axis = 0)).sum(axis = 1) * x_true
### Fsolve
initial_guess_deviation_factor = 2
x_0 = x_true * np.random.uniform(low = 0.9, high = 1.1, size = n) ** initial_guess_deviation_factor
def f(x):
return (l / ((np.tile(r, (n, 1)).transpose() / D / (np.tile(x, (n, 1)) / D).sum(axis = 0)).sum(axis = 1))) - x
# The solution is negative
x = fsolve(f, x_0)
### Minimize
def opt(x):
return (((l / ((np.tile(r, (n, 1)).transpose() / D / (np.tile(x, (n, 1)) / D).sum(axis = 0)).sum(axis = 1))) - x) ** 2).sum()
def pos_constraint(x):
return x
result = minimize(opt, x0=x_0, constraints={'type': 'ineq', 'fun':pos_constraint}, tol = 1e-18)
# The solution is different from the true solution
print(abs(result.x - x_true).mean())
print(result.fun)
### Differential evolution
def opt(x):
return (((l / ((np.tile(r, (n, 1)).transpose() / D / (np.tile(x, (n, 1)) / D).sum(axis = 0)).sum(axis = 1))) - x) ** 2).sum()
# Since the solution is unique up to renormalization, I use bounds between 0 and 1 and renormalize after finding the solution
bounds = [(0, 1)] * n
result = differential_evolution(opt, bounds, seed=1)
result.x, result.fun
# Normalize solution
x_de = result.x / np.prod(result.x) ** (1/n)
print(abs(x_de - x_true).mean())
print(result.fun)
First, do some linear-algebraic analysis to derive the following simplified, equivalent form of your problem (and include regression tests):
import numpy as np
from numpy.random._generator import default_rng
from scipy.optimize import fsolve
rand = default_rng(seed=0)
n = 250 # differential evolution is slow, better use n = 5 to test
r = rand.random(n)
D = rand.uniform(low=1, high=2, size=(n, n))
rDD = r[:, np.newaxis] / (1/D).sum(axis=0) / D
# True solution x with geometric mean 1
x_true = rand.random(n)
x_true = x_true / x_true.prod() ** (1 / n)
# Solve for l implied by true x
l = rDD @ (1 / x_true) * x_true
def f(x: np.ndarray) -> np.ndarray:
return l / (rDD @ (1 / x)) - x
def regression_test() -> None:
assert l.shape == (250,)
assert np.isclose(l[0], 1.8437187927094683)
assert np.isclose(l.min(), 0.008011379562766192)
assert np.isclose(l.max(), 4.870546152196413)
assert np.isclose(l.sum(), 328.4768373368165)
f_0 = f(x_0)
assert f_0.shape == (250,)
assert np.isclose(f_0[0], -0.11599601776615498)
assert np.isclose(f_0.min(), -0.5897953289580671)
assert np.isclose(f_0.max(), 0.3885530509026145)
assert np.isclose(f_0.sum(), -9.253079363636402)
initial_guess_deviation_factor = 2
x_0 = x_true * rand.uniform(low=0.9, high=1.1, size=n) ** initial_guess_deviation_factor
regression_test()
Maybe I'm missing something, but your first attempt is very close at being both fast and accurate; you just need to divide out one term and the rest will be non-negative:
x = fsolve(f, x_0)
x /= x[0]
print(x)
print(f(x))
prints the following for x:
[1. 0.21849311 0.62362266 1.37331647 1.0126681 1.20579018 ...
and the following for f(x):
[ 1.18594023e-12 -2.77555756e-15 -1.30429001e-12 -7.40074668e-13 ...