python pandas numpy covariance covariance-matrix

How can I get a sub-covariance matrix from a covariance matrix in Python?


I have a covariance matrix (as a pandas DataFrame) in Python as follows:

    a   b   c    
a   1   2   3    
b   2   10  4    
c   3   4   100

I want to dynamically select only a subset of the covariance matrix. For example, the subset for a and c would look like this:

    a   c
a   1   3    
c   3   100

Is there any function that can select this subset?

Thank you!


Solution

  • If your covariance matrix is a NumPy array like this:

    import numpy as np

    cov = np.array([[1, 2,  3],
                    [2, 10, 4],
                    [3, 4, 100]])
    

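    Then you can get the desired submatrix by advanced indexing. np.ix_ selects every combination of the chosen rows and columns; plain cov[subset, subset] would instead return only the elements cov[0, 0] and cov[2, 2]: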

    subset = [0, 2]  # a, c
    
    cov[np.ix_(subset, subset)]
    
    # array([[  1,   3],
    #        [  3, 100]])
    

    Edit:

    If your covariance matrix is a pandas DataFrame (e.g. obtained as cov = df.cov() for some DataFrame df with columns 'a', 'b', 'c', ...), you can select the 'a' and 'c' subset by label with .loc:

    cov.loc[['a','c'], ['a','c']]
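
Since the question asks for a dynamic selection, here is a minimal sketch that wraps the .loc lookup in a function taking any list of labels. The helper name sub_covariance is hypothetical (it is not part of pandas), and the DataFrame is rebuilt from the numbers in the question so the example runs on its own. The last two lines show the same selection done positionally with np.ix_, assuming the row and column labels are identical and in the same order.

```python
import numpy as np
import pandas as pd

# Covariance matrix from the question, with labelled rows and columns.
labels = ["a", "b", "c"]
cov = pd.DataFrame(
    [[1, 2, 3],
     [2, 10, 4],
     [3, 4, 100]],
    index=labels,
    columns=labels,
)


def sub_covariance(cov, names):
    """Hypothetical helper: keep only the rows and columns listed in names."""
    names = list(names)
    return cov.loc[names, names]


sub = sub_covariance(cov, ["a", "c"])
print(sub)

# Equivalent positional selection on the underlying NumPy array.
positions = cov.index.get_indexer(["a", "c"])
sub_np = cov.to_numpy()[np.ix_(positions, positions)]
print(sub_np)
```

The label-based version keeps the row and column names on the result, which is usually convenient when the subset is chosen at runtime. In recent pandas versions, passing a label that is not in the matrix should raise a KeyError rather than silently returning missing values.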