algorithm math statistics mathematical-optimization

algorithm to generate numbers which have a given average product based on distance

I am trying to come up with an algorithm which generates numbers that have a given average product based on their distance. For example, say there were 4 points on the x-axis, x1 x2 x3 and x4. Each point has a position and a value. x1's position is x=1, x2's position is x=2, etc. So that means the distance between x1 and x2 is 1, while the distance from x2 and x4 is 2, etc.

I am looking for a method to generate the x values in a way so that their average product is equal to a given number. So for each distance we would have an equation

d=3: (1/2) * (n1*n4 + n4*n1) = avg1

d=2: (1/4) * (n1*n3 + n2*n4 + n3*n1 + n4*n2) = avg2

d=1: (1/6) * (n1*n2 + n2*n3 + n3*n4 + n2*n1 + n3*n2 + n4*n3) = avg3

d=0: (1/4) * (n1^2 + n2^2 + n3^2 + n4^2) = avg4

I tried by setting the avg values to small integers, then setting n1=1 and going from there. The resulting algebra left me with a system that doesn't have a solution. I have been using the distance matrix between the points to help visualize what is going on. I assume one has freedom to pick x1, like a seed, and would then try to derive the other points based on it.

Solution

The interior of this operation is non-linear. It could probably be represented better as a quadratic programming (QP) problem, but since those are not directly supported by scipy I demonstrate with a generic minimize.

You're best off pre-calculating a three-dimensional matrix of factors with dimensions for your distance, first n-index, and second n-index; and then during optimization using one einsum to reduce to a vector of means. These means can then be subtracted from the target means and a self-dot-product applied to get a least-squares cost.

import numpy as np
from scipy.optimize import minimize

N = 4


def make_factors() -> np.ndarray:
    factors = np.empty((N, N, N))

    for d in range(N):
        if d == 0:
            num = 1
        else:
            num = 0.5
        diag = np.full(shape=N - d, fill_value=num/(N - d))
        addend = np.diag(v=diag, k=d)
        factors[d, ...] = addend
        if d != 0:
            factors[d, ...] += addend.T

    return factors


def actual_mean(params: np.ndarray, factors: np.ndarray) -> np.ndarray:
    # Equivalent to factors[i, ...] dot outer(params, params)
    return np.einsum('ijk,j,k->i', factors, params, params)


def cost(params: np.ndarray, factors: np.ndarray, target_means: np.ndarray) -> float:
    error = actual_mean(params, factors) - target_means
    return error.dot(error)


def solve(factors: np.ndarray, target_means: np.ndarray) -> np.ndarray:
    '''
    d=1: 1/2(N-1) * (2(N-1)n^2) = mean1
    n = sqrt(mean1) for x0
    '''
    x0 = np.full(shape=N, fill_value=np.sqrt(target_means[1]))

    result = minimize(
        fun=cost,
        x0=x0,
        args=(factors, target_means),
    )
    if not result.success:
        raise ValueError(result.message)
    return result.x


def main() -> None:
    factors = make_factors()
    target_means = np.array((3.4, 3.7, 3.3, 2.5))
    fit = solve(factors, target_means)
    print('n =', fit)
    print(target_means, '~', actual_mean(fit, factors))


if __name__ == '__main__':
    main()

n = [1.57903356 2.09332332 2.09332332 1.57903356]
[3.4 3.7 3.3 2.5] ~ [3.43767474 3.66428601 3.30542776 2.49334698]