python numpy matrix-multiplication

Centering matrix


I want to write a function that centers an input data matrix by multiplying it with the centering matrix. The function should subtract the row-wise mean from the input.

My code:

import numpy as np

def centering(data):
  n = data.shape()[0]
  centeringMatrix = np.identity(n) - 1/n * (np.ones(n) @ np.ones(n).T)
  data = centeringMatrix @ data


data = np.array([[1,2,3], [3,4,5]])
center_with_matrix(data)

But the result is wrong: the matrix is not centered.

Thanks!


Solution

  • The centering matrix is

    np.eye(n) - np.ones((n, n)) / n
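
As a quick sanity check, here is a minimal sketch showing that left-multiplying by this matrix subtracts the mean of each column. The sample array `X` and the names `C` and `centered` are made up for illustration and are not from the question:

```python
import numpy as np

# Hypothetical 3x2 sample; any 2D array with n rows would do.
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]])
n = X.shape[0]

C = np.eye(n) - np.ones((n, n)) / n  # n x n centering matrix
centered = C @ X

# Left-multiplication removes the mean of every column,
# so both checks should print True.
print(np.allclose(centered, X - X.mean(axis=0)))
print(np.allclose(centered.mean(axis=0), 0.0))
```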
    

    Here is a list of issues in your original formulation:

    1. np.ones(n).T is the same as np.ones(n). The transpose of a 1D array is a no-op in numpy, so np.ones(n) @ np.ones(n).T is a plain dot product (the scalar n), not an n-by-n matrix of ones. If you want to turn a row vector into a column vector, add the dimension explicitly:

      np.ones((n, 1))
      

      OR

      np.ones(n)[:, None]
      
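    2. The usual definition subtracts the column-wise mean, not the row-wise mean. To center row-wise, take n from the second axis, transpose the input before multiplying, and transpose the result back (this is equivalent to right-multiplying the input by the centering matrix, which is symmetric). Note that shape is an attribute, not a method:

      n = data.shape[1]
      ...
      data = (centeringMatrix @ data.T).T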
      
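    3. Your function creates a new array for the output but does not currently return anything. You can either return the result, or perform the assignment in-place (the in-place forms need data to have a floating-point dtype, since the centered values are generally not integers):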

      return (centeringMatrix @ data.T).T
      

      OR

      data[:] = (centeringMatrix @ data.T).T
      

      OR

      np.matmul(centeringMatrix, data.T, out=data.T)
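

Putting the points above together, a minimal sketch of a corrected function might look like the following. It assumes the goal is row-wise centering as in the question and that returning a new float array is acceptable. Right-multiplying by the centering matrix is used here in place of the double transpose (the two are equivalent because the matrix is symmetric), and the name `centering_matrix` is my own choice:

```python
import numpy as np

def centering(data):
    """Return a copy of `data` with the mean of each row subtracted."""
    # Work in floating point so that integer input is not truncated.
    data = np.asarray(data, dtype=float)
    n = data.shape[1]  # .shape is an attribute, not a method
    centering_matrix = np.eye(n) - np.ones((n, n)) / n
    # Right-multiplying centers along rows; this equals
    # (centering_matrix @ data.T).T because the matrix is symmetric.
    return data @ centering_matrix

data = np.array([[1, 2, 3], [3, 4, 5]])
result = centering(data)
print(result)  # each row should now have (approximately) zero mean
```

If the explicit matrix is not required, `data - data.mean(axis=1, keepdims=True)` should give the same result without building an n-by-n array.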