Search code examples
optimizationconstraintslinear-algebralinear-programmingequation-solving

Solving linear system of equations with constraints on unknowns


I would like to solve a system of linear equations y=Uh for an unknown vector h, where I have a few constraints on some of the elements of $h$. The matrix U is composed of a vector u (length N) and a delayed version of it.

Let us consider a small example: y is a vector of size N corrupted by noise, U is a matrix of size $N \times 9$, where:

the 1st column of U is the vector u;
the 2nd column is one time step delayed vector u i.e. [0; u(1:end-1)];
the 3rd column is two time step delayed vector u i.e. [0; 0; u(1:end-2)];
the 4th column is a square of individual elements of 1st column;
the 5th column is a square of individual elements of 2nd column;
the 6th column is a square of individual elements of the 3rd column;
the 7th column is the product of individual elements of columns 1 and 2;
the 8th column is the product of individual elements of columns 1 and 3;
the 9th column is the product of individual elements of columns 2 and 3;

and $h$ is a vector of size $9$, where $N>9$. Here, I have few constraints on the vector $h$. If the first three elements are $a, b, c$, then the next three elements should be $pa^2, pb^2, pc^2$, where $p$ is a non-zero scaling factor. The last three elements should then be $2pab, 2pac, 2pbc$. How do I use such constraints on parameters using another parameter of an unknown vector? Could someone help me with this?

The solution h = pinv(U)*y does not guarantee the desired structure of h. How to ensure that the solution has this structure and it best approximates the true noisy output y? Here the unknowns are a,b,c,p.


Solution

  • This is not a linear system; it's a nonlinear system. Also, you must almost definitely adopt a weaker definition of solve, because the system is over-determined and so can only be solved in a least-squares sense. There are only three unknowns (a, b, c), and N>9 equations. By example,

    import numpy as np
    from scipy.optimize import least_squares
    
    
    def make_h(abcp: np.ndarray) -> np.ndarray:
        abc, (p,) = np.split(abcp, (-1,))
        a, b, c = abc
    
        # Here, I have few constraints on the vector $h$.
        return np.concatenate((
            # If the first three elements are $a, b, c$,
            abc,
            # then the next three elements should be $pa^2, pb^2, pc^2$.
            p*abc**2,
            # The last three elements should then be $2pab, 2pac, 2pbc$.
            (
                2*p*a*b,
                2*p*a*c,
                2*p*b*c,
            ),
        ))
    
    
    def residuals(
        abcp: np.ndarray, U: np.ndarray, y: np.ndarray,
    ) -> np.ndarray:
        h = make_h(abcp)
    
        # I would like to solve a system of linear equations $y=Uh$ for an unknown vector $h$.
        yhat = U @ h
        return yhat - y
    
    
    def solve(
        U: np.ndarray, y: np.ndarray,
    ) -> tuple[np.ndarray, np.ndarray]:
        M = U.shape[1]//3
        N = U.shape[0]
        result = least_squares(
            fun=residuals,
            args=(U, y),
            x0=np.ones(M + 1),
        )
        if not result.success:
            raise ValueError(result.message)
    
        return make_h(result.x), result.x
    
    
    def main() -> None:
        rand = np.random.default_rng(seed=0)
        M = 3
        N = 20                        # $N>9$.
        u = rand.random(size=N)  # a vector u (length N)
        padded = np.concatenate((np.zeros(M - 1), u))
        delay = np.lib.stride_tricks.sliding_window_view(
            x=padded,
            window_shape=N,
        )[::-1, :]
        ua, ub, uc = delay
        U = np.vstack((
            delay, delay**2,
            ua*ub, ua*uc, ub*uc,
        )).T
    
        # Make some fake data that stand a chance of being solvable
        abcp_hidden = rand.random(size=M + 1)
        h_hidden = make_h(abcp_hidden)
        noise = rand.normal(size=N, scale=0.05)
    
        # $y$ is a vector of size $N$
        y = noise + U@h_hidden
    
        h, (*abc, p) = solve(U, y)
        print('h =', h)
        print('abc =', abc)
        print('p =', p)
    
    
    if __name__ == '__main__':
        main()
    
    h = [2.70105108e-02 1.47198194e-01 6.76129121e-01 4.22152723e-04
     1.25374428e-02 2.64522905e-01 4.60118054e-03 2.11347169e-02
     1.15177095e-01]
    abc = [0.02701051078001584, 0.14719819396169204, 0.6761291207038694]
    p = 0.5786340691198476